Description

This invention relates to expression and expression followed by secretion of proteins from filamentous fungi. 
BACKGROUND OF THE INVENTION

One goal of recombinant DNA technology is the insertion of DNA segments which encode commercially or scientifically
valuable proteins into a host cell which is readily and economically available. Genes selected for insertion are
normally those which encode proteins produced in only limited amounts by their natural hosts or those which are
indigenous to hosts too costly to maintain. Transfer of the genetic information in a controlled manner to a host which
is capable of producing the protein either in greater yield or more economically in a similar yield provides a more
desirable vehicle for protein production.

Genes encoding proteins contain promoter regions of DNA which are essentially attached to the 5' terminus of the 
protein coding region. The promoter regions contain the binding site for RNA polymerase II. RNA polymerase II effectively
catalyses the assembly of the messenger RNA complementary to the appropriate DNA strand of the coding
region. In most promoter regions, a nucleotide base sequence related to the sequence known generally as a "TATA 
box" is present and is generally disposed some distance upstream from the start of the coding region and is required 
for accurate initiation of transcription. Other features important or essential to the proper functioning and control of the 
coding region are also contained in the promoter region, upstream of the start of the coding region. 

Filamentous fungi, particularly the filamentous ascomycetes such as Aspergillus, e.g. Aspergillus niger, represent
a class of micro-organisms suitable as recipients of foreign genes coding for valuable proteins. Aspergillus niger and
related species are currently used widely in the industrial production of enzymes, e.g. for use in the food industry. Their
use is based on the secretory capacity of the microorganism. Because they are well characterized and because of
their wide use and acceptance, there is both industrial and scientific incentive to provide genetically modified and
enhanced cells of A. niger and related species, including A. nidulans, in order to obtain useful proteins.

Expression and secretion of foreign proteins from filamentous fungi has not yet been achieved. It is by no means 
clear that the strategies which have been successful in yeast would be successful in filamentous fungi such as Aspergillus.
Evidence has shown that yeast is an unsuitable system for the expression of filamentous fungal genes (Penttila
et al., Molec. Gen. Genet. (1984) 194:494-499) and that yeast genes do not express in filamentous fungi. Genetic
engineering techniques have only recently been developed for Aspergillus nidulans and Aspergillus niger. These techniques
involve the incorporation of exogenously added genes into the Aspergillus genome in a form in which they are
able to be expressed. 

To date no foreign proteins have been expressed in and secreted from filamentous fungi using these techniques. 
This has been due to a lack of suitable expression vectors and their constituent components. These components 
include Aspergillus promoter sequences described above, the region encoding the desired product and the associated 
sequences which may be added to direct the desired product to the extracellular medium. 

As noted, expression of the foreign gene by the host cell requires the presence of a promoter region situated 
upstream of the region coding for the protein. This promoter region is active in controlling transcription of the coding 
region with which it is associated, into messenger RNA which is ultimately translated into the desired protein product. 
Proteins so produced may be categorized into two classes on the basis of their destiny with respect to the host.

A first class of proteins is retained intracellularly. Extraction of the desired protein, when intracellular, requires that 
the genetically engineered host be broken open or lysed in order to free the product for eventual purification. Intracellular 
production has several advantages. The protein product can be concentrated i.e. pelleted with the cellular mass, and 
if the product is labile under extracellular conditions or structurally unable to be secreted, this is a desired method of 
production and purification.

A second class of proteins comprises those which are secreted from the cell. In this case, purification is effected on the
extracellular medium rather than on the cell itself. The product can be extracted using methods such as affinity
chromatography, and continuous flow fermentation is possible. Also, certain products are more stable extracellularly and
benefit from extracellular purification. Experimental evidence has shown that secretion of proteins in eukaryotes
is almost always dictated by a secretion signal peptide (hereafter called signal peptide) which is usually located at the
amino terminus of the protein. Signal peptides have characteristic distributions as described by G. Von Heijne in Eur.
J. Biochem 17-21 (1983) and are recognizable by those skilled in the art. The signal peptide, when recognized by the
cell, directs the protein into the cell's secretory pathway. During secretion, the signal peptide is cleaved off, making the
protein available for harvesting in its mature form from the extracellular medium.

Both classes of protein, intracellular and extracellular, are encoded by genes which contain a promoter region 
coupled to a coding region. Genes encoding extracellularly directed proteins differ from those encoding intracellular 
proteins in that, in genes encoding extracellular proteins, the portion of the coding region nearest to the promoter (which 
is the first part to be transcribed by RNA polymerase) encodes a signal peptide. The nucleotide sequence encoding 
the signal peptide, hereafter denoted the signal peptide coding region or the signal sequence, is operationally part of
the coding region per se. 

SUMMARY OF THE INVENTION 



A system has now been developed by which filamentous fungi may be transformed to express a desired protein.
With this system, transformation can result in a filamentous fungus which is capable not only of expressing the protein
but of secreting that protein as well, regardless of whether or not the protein is a naturally secreted one. In addition,
the level at which the protein is expressed can be controlled according to certain aspects of the invention. It will be
appreciated by those skilled in the art that the system provided permits filamentous fungi to function as a valuable source
of proteins and provides an alternative which in many applications is superior to bacterial and yeast systems.

Thus, in a general aspect, the invention provides a filamentous fungus transformation system by which the genetic 
constitution of these fungus cells may be modified so as to alter either the nature or the amount of the proteins expressed
by these cells. More specific aspects of the invention are defined below. 

According to the present invention there is provided a recombinant DNA construct comprising a regulatable promoter
region of a filamentous fungus gene operably linked to a DNA fragment encoding a polypeptide, wherein:

(a) said DNA fragment is foreign to the promoter region; and 

(b) said regulatable promoter region is inducible by ethanol, said ethanol inducibility being mediated by the alcR
gene product.

There is yet further provided a method for producing a polypeptide which comprises: 

(i) culturing a filamentous fungus which has been transformed by a recombinant DNA construct in which a regulatable
promoter region of a filamentous fungus gene is operably linked to a DNA fragment encoding the polypeptide,
wherein:

(a) said DNA fragment is foreign to the promoter region; and

(b) said regulatable promoter region is inducible by ethanol, said ethanol inducibility being mediated by the
alcR gene product;

(ii) under conditions in which regulated expression of the polypeptide occurs; and 

(iii) recovering said polypeptide. 

There is further provided a filamentous fungus which is transformed with a recombinant DNA construct of the 
invention. 

In the present invention, from one aspect, a promoter DNA region associated with a coding region in filamentous 
fungi such as A. niger , A. nidulans or a related species is identified and isolated, appropriately joined in a functional 
relationship with a second, different DNA coding region, outside the cell, and then re-introduced into a host filamentous 
fungus using an appropriate vector. Transformed host cells express the protein of the second coding region, under the 
control of the introduced promoter region. The second coding region may be one which is foreign to the host species, 
in which case the host will express and in some cases secrete a protein not naturally expressed by the given host. 

The present invention provides the ability to introduce foreign coding regions into filamentous fungi along with 
promoters to arrange for the host fungi to express different proteins. It also provides the ability to regulate transcription 
of the individual genes which occur naturally therein or foreign genes introduced therein, via the promoter region which 
has been introduced into the host along with the gene. For example, the promoter regions naturally associated with the
alcohol dehydrogenase I (alcA) gene and the aldehyde dehydrogenase (aldA) gene of A. nidulans are regulatable by
means of ethanol, threonine, or other inducing substances in the extracellular medium. This effect is dependent on the
integrity of a gene known as alcR. When the alcA or aldA promoter region is associated with a foreign protein coding 
region in Aspergillus or the like, in accordance with the present invention, similar regulation of the expression of the 
different genes by ethanol or other inducers can be achieved. 

In another aspect, the present invention provides a genetic vector capable of introducing the segment carrying the 
above mentioned regulatable promoter and signal peptide coding region with integral protein coding region into the 
genome of a filamentous fungus host. The protein coding region can be either native to or foreign to the host filamentous
fungus.



The present invention thus also provides a novel construct comprising a DNA sequence constituting the promoter region
mentioned above, active in cells of filamentous fungi, and a coding region chemically bound to said DNA sequence in operative
association therewith, said coding region being capable of expression in a filamentous fungus host under influence of
said DNA sequence.

The present invention further provides a process of genetically modifying a filamentous fungus host cell which 
comprises introducing into the host cell, by means of a suitable vector, a coding region capable of expression in the 
transformed Aspergillus host cell and a promoter region as described above active in the transformed Aspergillus host 
cell, the coding region, which is foreign to the promoter region, and the promoter being chemically bound together and
in operative association with one another. 

This process also encompasses the introduction of multiple copies of the selected construct into the host to provide 
for enhanced levels of gene expression. If necessary or desirable, introduction of multiple construct copies is accompanied
by introduction of multiple copies of genes encoding products having a regulatory effect on the construct.

The present invention also comprises filamentous fungal cells transformed by the constructs of the invention. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Preferred hosts according to the invention are the filamentous fungi of the ascomycete class, most preferably
Aspergillus sp., including A. niger, A. nidulans and the like.

In the preferred form of the invention, the promoter region associated with either the Aspergillus niger glucoamylase
gene or the alcohol dehydrogenase I gene or aldehyde dehydrogenase gene of Aspergillus nidulans is used in
preparing an appropriate vector plasmid.

Any or all of these promoter regions are regulatable in the host cell by the addition of the appropriate inducer
substance. In alcA and aldA, this induction is mediated by the protein product of a third gene, alcR, which is controlled
via the promoter. Evidence indicates that the availability of alcR product can limit the promoting function of the alcA
and aldA promoters when multiple copies of a construct containing the alcA promoter or the aldA promoter are introduced
into a host without corresponding introduction of multiple copies of the alcR gene. In such a case, the amount
of alcR product which the host can produce may be insufficient to meet the demands of the several promoters requiring
induction by the alcR product. Thus, transformation of filamentous fungal hosts by multiple copies of constructs containing
the alcA or aldA promoter is accompanied by introduction of multiple copies of the alcR gene, according to a
preferred embodiment of the present invention. In other instances, transcription can be repressed, for example by
utilizing high levels of glucose (and some other carbon sources) in the medium to be used for growth of the host. The
expression of the product encoded by the coding region and controlled by the promoter is then delayed until after the
end of the cell growth phase, when all of the glucose has been consumed and the gene is derepressed. The inducer
may be added at this point to enhance the activity of the promoter.

The destination of the protein product of the coding region which has been selected to be expressed under the 
control of the promoter described above is determined by the nucleotide sequence of that coding region. As mentioned, 
if the protein product is naturally directed to the extracellular environment, it will inherently contain a secretion signal 
peptide coding region. Protein products which are normally intracellularly located lack this signal peptide.

Thus, for the purposes of the present disclosure it is to be understood that a "coding region" encodes a protein
which is either retained intracellularly or is secreted. (This "coding region" is sometimes referred to in the art as a
structural gene i.e. that portion of a gene which encodes a protein.) Where the protein is retained within the cell that 
produces it, the coding region will usually lack a signal peptide coding region. Secretion of the protein encoded within 
the coding region can be a natural consequence of cell metabolism in which case the coding region inherently contains 
a signal peptide coding region linked naturally in translation reading frame with that segment of the coding region which 
encodes the secreted protein. In this case, insertion of a signal peptide coding region is not required. In the alternative, 
the coding region may be manipulated to introduce a signal peptide coding region which is foreign to that portion of 
the coding region which encodes the secreted protein. This foreign signal peptide coding region may be required where 
the coding region does not naturally contain a signal peptide coding region or it may simply replace the natural signal 
peptide coding region in order to obtain enhanced secretion of the desired protein with which the natural signal peptide 
is normally associated. 

In accordance with another preferred aspect of the invention, therefore, a signal peptide coding region is provided, 
if required i.e. when the coding region which has been selected to be expressed under the control of the promoter 
described above does not itself contain a signal peptide coding region. The signal peptide coding region used is preferably
either one which is associated with the Aspergillus niger glucoamylase gene or a synthetic signal peptide coding
region which is made in vitro and used in the preparation of an appropriate vector plasmid. Most preferably, these 
signal peptide coding regions are modified at one or both termini to permit ligation thereof with other components of a 
vector. This ligation is effected in such a way that the signal peptide coding region is interposed between the promoter 
region and the protein encoding segment of the coding region such that the signal peptide coding region is in frame 
with that segment of the coding region which encodes the mature, functional protein. 

BRIEF REFERENCE TO THE DRAWINGS

Figure 1A is an illustration of the base sequence of the DNA constituting the coding region and promoter region
of the alcohol dehydrogenase I (alcA) gene of Aspergillus nidulans.

Figure 1B is an illustration of the base sequence of DNA constituting the coding region and promoter region of the
aldehyde dehydrogenase (aldA) gene of Aspergillus nidulans.

Figure 2 is a diagrammatic illustration of a process of constructing plasmid pDG6 useful in transforming a filamentous
fungal cell;

Figure 3 is a linear representation of a portion of the plasmid pDG6 of Fig. 2;

Figure 4 is a diagrammatic illustration of the plasmid maps of pGL1 and pGL2;

Figure 5 is an illustration of a selection of synthetic linker sequences for insertion into plasmid pGL2;

Figure 6 is an illustration of the nucleotide sequence of a fragment of pGL2;

Figure 7 is an illustration of plasmid map pGL2B and pGL2BIFN;

Figure 8 is an illustration of the nucleotide sequence of a fragment of pGL2BIFN;

Figure 9 illustrates plasmid pALCAIS and a method for its preparation;

Figure 10 illustrates the plasmid map of pALCAISIFN and a method for its preparation;

Figure 11 represents the nucleotide sequence of a fragment of pALCAISIFN;

Figure 12 illustrates the plasmid map of pGL2CENDO;

Figure 13 represents the nucleotide sequence of a fragment of pGL2CENDO;

Figure 14 represents a plasmid map of pALCAISENDO;

Figure 15 represents the nucleotide sequence of a fragment of pALCAISENDO;

Figure 16 illustrates plasmid pALCAIAMY and a method for its preparation; and

Figure 17 represents the nucleotide sequence of a segment of pALCAIAMY shown in Figure 16.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

In the present invention, an appropriate promoter region of a functioning gene in A. niger or A. nidulans or the like
is identified. Procedures for identifying each of the genes containing the desired promoter regions are similar and, for
that reason, only the manner of locating and identifying the alcA gene and the promoter therein is outlined. For this purpose,
cells of the chosen species are induced to express the selected protein, e.g. alcA, and from these cells the
messenger RNA is isolated. One portion thereof, as yet unidentified, codes for alcA. Complementary DNA is
prepared from the mRNA fragments and cloned into a vector. Messenger RNA isolated from induced A. nidulans is
size fractionated to enrich for alcA sequences, end labelled and hybridized to the cDNA clones made from the alcA+
strain. The clone containing the cDNA which hybridizes to alcA+ mRNA contains the DNA copy of the alcA mRNA.
This piece is hybridized to a total DNA gene bank from the chosen Aspergillus species, to isolate the selected coding
region, e.g. alcA, and its flanking regions. The aldA coding region was isolated using analogous procedures.

The coding region starts at its 5' end with a codon ATG coding for methionine, in common with other coding regions
and proteins. Where the amino acid sequence of the expressed protein is known, the DNA sequence of the coding
region is readily recognizable. Immediately "upstream" of the ATG codon is the leader portion of the messenger RNA 
preceded by the promoter region. 

With reference to Figs. 1A and 1B, these show portions of the total DNA sequence from A. nidulans, with conventional
base notations. The portion shown in Figure 1A contains the promoter region and the coding region of the alcA
gene, which encodes the enzyme alcohol dehydrogenase I. The portion shown in Figure 1B contains the promoter
region and the region encoding the enzyme aldehyde dehydrogenase, i.e. aldA. (In both cases, the term "IVS" represents
intervening sequences.) The amino acid sequences of these two enzymes are known in other species. From these, the
regions 10 and 10' are recognisable as the coding regions. Each coding region starts at its 5' ("upstream") end with
methionine codon ATG at 12. The appropriate amino acid sequences encoded by the protein coding region are entered
below the respective rows on Figs. 1A and 1B, in conventional abbreviations. Immediately upstream of codon 12 is
the region coding for the messenger RNA leader and the promoter region, the length of which, in order to contain all
the essential structural features enabling it to function as a promoter, now needs to be determined or at least estimated.
Each of Figures 1A and 1B shows a sequence of about 800 bases upstream from the ATG codon 12.

It is predictable from analogy with other known promoters that all the functional essentials are likely to be contained
within a sequence of about 1000 bases in length, probably within the 800 base sequence illustrated, and most likely
within the first 200 - 300 base sequence, i.e. back to about position 14 on Figs. 1A and 1B. An essential function of a
promoter region is to provide a site for accurate initiation of transcription, which is known to be a TATA box sequence.
Such a sequence is found at 16 on the alcA promoter sequence of Fig. 1A, and at 16' on the aldA promoter sequence
of Fig. 1B. Another function of a promoter region is to provide an appropriate DNA sequence active in regulation of the
gene transcription, e.g. a binding site for a regulatory molecule which enhances gene transcription, or for rendering
the gene active or inactive. Such regulator regions are within the promoter region illustrated in Figs. 1A and 1B for the
alcA and aldA genes, respectively.
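
As an illustration of how a TATA box candidate might be located within the 200 - 300 bases upstream of the ATG codon, the following minimal Python sketch scans a window upstream of the start codon. The motif pattern TATA[AT]A[AT] is a commonly quoted consensus used here as an assumption, the function name tata_candidates is hypothetical, and the sequence is a made-up toy string, not the alcA or aldA sequence of Figs. 1A and 1B.

```python
import re

# Commonly quoted TATA-box consensus; an assumption, not taken from the figures.
TATA_PATTERN = re.compile(r"TATA[AT]A[AT]")


def tata_candidates(seq, atg_index, window=300):
    """Return (position, distance upstream of ATG) for each TATA-like motif
    found within `window` bases upstream of the ATG start codon."""
    seq = seq.upper()
    start = max(0, atg_index - window)
    upstream = seq[start:atg_index]
    hits = []
    for match in TATA_PATTERN.finditer(upstream):
        position = start + match.start()
        hits.append((position, atg_index - position))
    return hits


# Toy sequence for illustration only (hypothetical, not from Figure 1A or 1B).
toy = "GCGTATAAATCGGCACCTTCAGATGTCT"
atg = toy.find("ATG")
print(tata_candidates(toy, atg))
```

A match only flags a candidate; whether a given motif actually functions in transcription initiation has to be established experimentally.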

The precise upstream 5' terminus of the DNA sequence used herein as a promoter region is not critical, provided
that it includes the essential functional sequences as described herein. Excess DNA sequences upstream of the 5' 
terminus are unnecessary, but unlikely to be harmful in the present invention. 

Having determined the extent of the sequence containing all the essential functional features to constitute a promoter
region of the given gene, by techniques described herein, the next step is to cut the DNA chain at a convenient
location downstream of the promoter region terminus and to remove the protein coding region, to leave basically a
sequence comprising the promoter region and sometimes part of the region coding for the messenger RNA leader.
For this purpose, appropriately positioned restriction sites are to be located, and then the DNA treated with the appropriate
restriction enzymes to effect scission. Restriction sites are recognizable from the alcA sequence illustrated in
Figure 1A. For the upstream cutting, a site is chosen sufficiently far upstream to include in the retained portion all of
the essential functional sites for the promoter region. As regards the downstream scission, no restriction site presents
itself exactly at the ATG codon 12 in the case of alcA. The closest downstream restriction site thereto is the sequence
GGGCCC at 13, at which the chain can be cut with restriction enzyme Apa I. If desired, after such scission, the remaining
nucleotides from location 13 to location 12 can be removed, in stepwise fashion, using an exonuclease. With knowledge
of the number of such nucleotides to be removed, the exonuclease action can be appropriately stopped when the
location 12 is passed. By locating a similar restriction site downstream of the methionine codon 12 of the aldA coding
region shown in Fig. 1B, this promoter region is similarly excised for subsequent use. In many cases, residual nucleotides
on the 5' terminus of the promoter region are not harmful to and do not significantly interfere with the functioning
of the promoter region, so long as the reading frame of the base triplets is maintained.
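
The search for a restriction site downstream of the ATG codon, and the count of nucleotides lying between the two, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names find_sites and nucleotides_to_trim are hypothetical, the sequence is a made-up toy string rather than the alcA sequence of Figure 1A, and the distance is measured to the first base of the recognition sequence, not to the exact position at which the enzyme cuts.

```python
# Recognition sequences named in the text.
APA_I = "GGGCCC"
ECO_RI = "GAATTC"


def find_sites(seq, site):
    """Return the 0-based start positions of every occurrence of `site`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def nucleotides_to_trim(seq, atg_index, site):
    """Number of bases from the A of the ATG codon to the first base of the
    nearest `site` at or downstream of the ATG, or None if there is no site."""
    downstream = [p for p in find_sites(seq, site) if p >= atg_index]
    if not downstream:
        return None
    return downstream[0] - atg_index


# Toy sequence for illustration only (hypothetical, not from Figure 1A).
toy = "TTGAATTCAGCTATAAATCGGCACCATGTCTGGGCCCAAGT"
atg = toy.find("ATG")
print(find_sites(toy, ECO_RI))               # upstream EcoRI site(s)
print(nucleotides_to_trim(toy, atg, APA_I))  # bases between ATG and Apa I site
```

Knowing this count in advance is what allows the exonuclease digestion described above to be stopped at the right point.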

Fig. 2 of the accompanying drawings illustrates diagrammatically the steps in a process of preparing plasmid pDG6
which can be used to create Aspergillus transformants according to the present invention. In Fig. 2, item 18 is a recombinant
plasmid containing the endoglucanase (cellulase) coding region 30 from the bacterium Cellulomonas fimi, namely a
BamHI endoglucanase fragment from C. fimi in known vector M13MP8. It contains relevant restriction sites for EcoRI,
Hind III and BamHI as shown, as well as others not shown and not of consequence in the present process. Item 20 is
a recombinant plasmid designated p5, constructed from known E. coli plasmid pBR322 and containing an EcoRI fragment
of A. nidulans containing the alcA promoter region prepared as described above, along with a small portion of
the alcA coding region, including the start codon ATG. It has restriction sites as illustrated, as well as other restriction
sites not used in the present process and so not illustrated. Plasmid p5 contains a DNA sequence 22, from site EcoRI
(3') to site Hind III (5'), which is in fact a part of the sequence illustrated on Fig. 1A, upper row, from position 15 (the
sequence GAATTC thereat constituting an EcoRI restriction site) to position 17. Sequence 22 in plasmid 20 is approximately
2 kb in length.

The plasmids 18 and 20 are next cut with restriction enzymes EcoRI and Hind III, so as to excise the alcA promoter
region and the endoglucanase coding region 30, which are ligated to Hind III-cut plasmid pUC12 to form a novel
construct pDG5A containing these sequences on pUC12, as shown in Fig. 2. Plasmid pUC12 is a known, commercially
available E. coli plasmid, which replicates efficiently in E. coli, so that abundant copies of pDG5A can be made if
desired. Novel construct pDG5A is isolated from the other products of the construct preparation. Next, construct pDG5A
is provided with a selectable marker so that subsequently obtained transformants of Aspergillus into which the construct
has successfully entered can be selected and isolated. In the case of Arg B- Aspergillus hosts, one can suitably use
an Arg B gene from A. nidulans for this purpose. The Arg B gene codes for the enzyme ornithine transcarbamylase,
and strains containing this gene are readily selectable and isolatable from Arg B- strains by standard plating out and
cultivation techniques. Arg B- strains will not grow on a medium lacking arginine.

To incorporate a selectable marker, in this embodiment of the invention as illustrated in Fig. 2, construct pDG5A
may be ligated with the Xba I fragment 32 of plasmid pDG3 ATCC 53006 (see U.S. patent application serial number
06/678,578, Buxton et al, filed December 5, 1984), which contains the Arg B+ gene from A. nidulans, using Xba I, to form
novel construct pDG6, which contains the endoglucanase coding region, the alcA promoter sequence and the Arg B
gene. Plasmid pDG6 is then used in transformation, to prepare novel Aspergillus mutant strains containing an endoglucanase
coding region under the control of the alcA promoter, as described in more detail in Example 1.

Fig. 3 shows in linear form the diagrammatic sequence of the functional portion of construct pDG6, from the Hind
III site 24 to the Hind III site 26. It contains the alcA promoter region 22, the ATG codon 12 and a small residual portion
of the alcA coding region downstream of the ATG codon as shown in Fig. 1, followed by the cellulase coding region
30 derived from plasmid 18.

Plasmid pDG6 is but one example of a vector which contains a filamentous fungal promoter linked to a protein
coding region foreign to the fungus. In another vector exemplified herein with reference to Figure 16 and referred to
herein as pALCAIAMY, the filamentous fungal promoter of the alcA gene is coupled with the naturally occurring sequence
coding for the α-amylase enzyme, a product which is foreign to the transformed fungal host. The protein products of
vectors pDG6 and pALCAIAMY are expressed by the respective transformed hosts in both cases. Further, because
the α-amylase coding region naturally contains a signal peptide coding region, this product can be secreted by the
transformed host, using the secretory machinery of the host, despite its foreign relationship with the host.

Identifying and isolating the promoter regions of filamentous fungi thus allows one to manipulate the host by transformation
with vectors containing these promoter regions coupled with a desired coding region.

If the coding region of the vector requires a signal peptide coding region or the existing signal sequence is to be
replaced by a different, preferably more efficient signal peptide coding region, such signal peptide coding regions may
be integrated between the promoter and that segment coding for the secreted protein. Plasmids pGL2 (Figure 4) and
pALCAIS (Figure 9) represent intermediate cloning vectors particularly suited for this purpose. Each can function as a
cassette, providing a promoter, a signal sequence and a restriction site downstream of the signal sequence which
permits insertion of a protein coding region in proper translational reading frame with the signal sequence.

Plasmid pGL2 shown in Figure 4 is created from pGL1, which contains the promoter 40, the signal sequence 42
and an initial portion 46 of the glucoamylase gene, all of which were derived in one segment from A. niger DNA according
to methods exemplified herein. In this segment, a BssH II restriction site is available toward the end of, but nevertheless
within, the glucoamylase signal sequence 42, the nucleotide sequence of which is reproduced below in Chart 1.



Chart 1

5' ATG TCG TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC TGC
   met ser phe arg ser leu leu ala leu ser gly leu val cys

                                   BssH II
   ACA GGG TTG GCA AAT GTG ATT TCC AAG CGC
   thr gly leu ala asn val ile ser lys arg
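
The correspondence between the codons and the amino acid residues listed in Chart 1 can be checked with a short Python sketch that translates the sequence using the standard genetic code. The table construction and the names CODON_TABLE and translate are illustrative choices, not part of the original disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Glucoamylase signal sequence as printed in Chart 1.
CHART_1 = (
    "ATG TCG TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC TGC "
    "ACA GGG TTG GCA AAT GTG ATT TCC AAG CGC"
)


def translate(dna):
    """Translate a DNA string codon by codon into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = translate(CHART_1)
print(peptide)
```

The 24 codons give a peptide beginning with methionine and ending in lys-arg, in agreement with the residues printed beneath the nucleotides. Note that the printed sequence ends in GCGC, so the full six-base BssH II recognition sequence (GCGCGC) would be completed only by nucleotides following the portion shown in the chart.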



In order to provide a segment downstream of the signal sequence, i.e. a linker 44, capable of receiving a protein
coding region in reading frame with the signal sequence 42, advantage is taken of the presence of the BssH II site
within the signal sequence and the Sst I site downstream thereof. In this specific embodiment, segment 46 is excised
from pGL1 and replaced with a selected one of three linkers shown in Figure 5 and denoted A, B or C. Each linker is
able to ligate with the BssH II end and the Sst I end. The linkers are also engineered so as to restore the terminal
codons of the signal sequence lost upon excision of segment 46 with BssH II. Further, each linker defines unique
EcoRV and Bgl II/Xho II sites within its nucleotide sequence so as to permit insertion of the desired coding region into
the vector pGL2.

Selection of the appropriate linker is made with knowledge of at least the first few codons of the protein coding
region to be inserted into the linker. In order for the protein coding region to be translated sensibly, the start of the
protein coding region must be either directly coupled with or be a specific number of nucleotides, i.e. in triplets, from
the start of the signal sequence. Accordingly, if the protein coding region to be inserted possesses one or two unessential
nucleotides (or a non-triplet factor thereof) at its 5' region as may result from routine excision, one of the three
linkers shown in Figure 5 can compensate for the presence of the extra, superfluous nucleotides and locate the start
of the protein coding region in translational reading frame with the signal sequence.

The amino acid residues encoded by the linkers A, B and C appear under their nucleotide sequences as shown 
in Figure 5, from which the effect of adding an additional nucleotide to the linker sequence on the reading frame of the 
linker and ultimately on the inserted protein encoding region may be noted. By designing the linkers such that the 
restriction site is always downstream of the reading frame modification i.e. one, two or three adenine residues in linkers 
A, B or C respectively, the reading frame of the coding region inserted into the restriction site can be maintained by 
appropriate linker selection. 
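
The arithmetic behind linker selection can be sketched as follows. This is a toy model only: it assumes that the sole frame-relevant difference between the linkers is the one, two or three adenine residues mentioned above, and the names `LINKER_PAD` and `choose_linker` are hypothetical. The actual linker sequences of Figure 5 are not reproduced here, so the assignment of letters to cases is illustrative rather than a statement of which linker suits a given insert.

```python
# Hypothetical model: number of frame-adjusting adenines carried by each linker.
LINKER_PAD = {"A": 1, "B": 2, "C": 3}


def choose_linker(extra_nt: int) -> str:
    """Return the linker that places the insert's first codon in frame.

    extra_nt is the number of superfluous nucleotides that precede the
    first codon of the coding region to be inserted.
    """
    if extra_nt < 0:
        raise ValueError("extra_nt must be non-negative")
    for name, pad in LINKER_PAD.items():
        # The frame is kept when padding plus extra bases is a whole number of triplets.
        if (pad + extra_nt) % 3 == 0:
            return name
    raise AssertionError("unreachable: the pads cover every residue modulo 3")


for extra in range(3):
    print(extra, choose_linker(extra))
```

Because the three pad lengths cover every remainder modulo 3, one linker always restores the frame whatever the number of extra 5' nucleotides.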

Exemplary of plasmids which employ plasmid pGL2 and specific linker segments A, B or C are pGL2BIFN, which
employs the B linker and results in interferon α-2 secretion when used in the transformation of a filamentous fungus,
e.g. Aspergillus sp., and pGL2CENDO, which employs the C linker and results in endoglucanase secretion when such
filamentous fungi are transformed therewith.

While plasmid pGL2 utilizes a naturally occurring signal sequence, it is within the scope of the invention also to 
utilize vectors containing synthetic signal sequences. An example of one such vector is pALCAIS which, like plasmid 
pGL2, represents an intermediate vector within which a protein coding region may be inserted to form a vector capable 
of transforming filamentous fungi. Unlike pGL2 however, pALCAIS utilizes the alcA promoter and utilizes a synthetic 
signal sequence coupled to that promoter. pALCAIS is illustrated in Figure 9 which shows a scheme for preparing it 
and to which further reference is made in the examples. Exemplary of plasmids created from pALCAIS are pALCAISIFN,
which results in secretion of interferon α-2 from a filamentous fungus transformed therewith, and pALCAISENDO, which
results in secretion of endoglucanase from a filamentous fungus host. In both instances secretion is obtained despite
the foreign nature of the secreted protein with respect to the host.

The invention is further described and illustrated by the following specific, non-limiting examples. 

Each of Examples 1 and 2 which follow exemplifies successful transformation of a filamentous fungal host using
vectors having a filamentous fungus-derived promoter coupled with naturally occurring but non-fungal coding regions.

Example 1 - Transformation of A. nidulans using pDG6 ATCC 53169 

The vector construct pDG6 shown in Figure 2 was first prepared following the process scheme illustrated in Figure
2, using standard routine ligation and restriction techniques. Then the construct pDG6 was introduced into Arg B-
mutant cells of Aspergillus nidulans as follows:

500 ml of complete media (Cove 1966) + 0.02% arginine + 10^-5 % biotin in a 2 l conical flask was inoculated
with 10^5 conidia/ml of an A. nidulans Arg B- strain and incubated at 30°C, shaking at 250 rpm, for 20 hours. The mycelia
were harvested through Whatman No. 54 filter paper, washed with sterile deionized water and sucked dry. The mycelia
were added to 50 ml of filter-sterile 1.2 M MgSO4, 10 mM potassium phosphate pH 5.8 in a 250 ml flask, to which was
added 20 mg of Novozym 234 (Novo Enzyme Industries), 0.1 ml (= 15000 units) of β-glucuronidase (Sigma) and 3 mg
of bovine serum albumin for each gram of mycelia. Digestion was allowed to proceed at 37°C with gentle shaking for
50-70 minutes, checking periodically for spheroplast production by light microscope. 50 ml of sterile deionized water
was added and the spheroplasts were separated from undigested fragments by filtering through 30 µm nylon mesh
and harvested by centrifuging at 2500 g for 5 minutes in a swing-out rotor in 50 ml conical bottom tubes, at room
temperature. The spheroplasts were washed, by resuspending and centrifuging, twice in 10 ml of 0.6 M KCl. The
number of spheroplasts was determined using a hemocytometer and they were resuspended at a final concentration
of 10*/ml in 1.2 M sorbitol, 10 mM Tris/HCl, 10 mM CaCl2 pH 7.5. Aliquots of 0.4 ml were placed in plastic tubes, to
which DNA pDG6 (total vol. 40 µl in 10 mM Tris/HCl, 1 mM EDTA pH 8) was added, and incubated at room temperature
for 25 minutes. 0.4 ml, 0.4 ml then 1.6 ml aliquots of 60% PEG4000, 10 mM Tris/HCl, 10 mM CaCl2 pH 7.5 were added
to each tube sequentially with gentle but thorough mixing between each addition, followed by a further incubation at
room temperature for 20 minutes. The transformed spheroplasts were then added to appropriately supplemented minimal
media 1% agar overlays, plus or minus 0.6 M KCl, at 45°C and poured immediately onto the identical (but cold)
media in plates. After 3-5 days at 37°C the number of colonies growing was counted (F. Buxton et al., Gene 37, 207-214
(1985)). The method of Yelton et al. [Proc. Nat'l Acad. Sci. U.S.A. 81, 1370-1374 (1980)] was also used.

The colonies were divided into two groups. Threonine (11.9 g/litre) and fructose (1 g/litre) were added to the
incubation medium for one group to induce the cellulase gene incorporated therein. No inducer was added to the other
group, which was repressed by growth on minimal media with glucose as sole carbon source. Both groups were
assayed for general protein production by BioRad assay, following cultivation, filtering to separate the mycelia, freeze
drying, grinding and protein extraction with 20 mM Tris/HCl at pH 7.

To test for production of cellulase, plates of agar medium containing cellulose (9 g/litre carboxymethylcellulose)
were prepared, small pieces of glass fibre filter material, isolated from one another, were placed on them, and 75 µg of
total protein from one of the transformants was added to each of the filters. The plates were incubated overnight at 37°C.
The filters were then removed, and the plates stained with congo red to determine the locations where cellulase had been
present in the total protein on the filters, as evidenced by the breakdown of cellulose in the agar medium below. The plates
were de-stained by washing with 5 M NaCl in water, to detect the differences visibly.

Of four transformants induced with threonine and fructose, three clearly showed the presence of cellulase in the 
total protein product. The non-induced, glucose repressed transformants did not show evidence of cellulase production. 

Three control transformants were also prepared from the same vector system and strains, but omitting the promoter
sequence. None of them produced cellulase, with or without inducers. The presence of the C. fimi endoglucanase coding
region was verified by the fact that medium from threonine-induced transformed strains showed reactivity with a monoclonal
antibody raised against C. fimi endoglucanase. This monoclonal antibody showed no cross-reactivity with endogenous
A. nidulans proteins in control strains.

Example 2 - Transformation of A. nidulans using pALCAIAMY ATCC 53380 

The vector construct pALCAIAMY was prepared as indicated in Figure 16, using standard routine ligation and
restriction techniques. In particular, and with reference to Figure 16, vector pALCAI containing a Hind III-EcoRI segment
in which the A. nidulans alcohol dehydrogenase I promoter 22 is located (as described previously), was cut at its EcoRI
site in order to insert the coding region of the wheat α-amylase gene 72 contained within an EcoRI-EcoRI fragment
defined on plasmid p501 (see S.J. Rothstein et al, Nature, 308, 662-665 (1984)). As wheat α-amylase is a naturally
secreted protein, its coding region 72 contains a signal peptide coding region 76 and a segment 78 which encodes
mature, secreted α-amylase. Ligation of coding region 72 contained in the EcoRI-EcoRI segment of p501 within the
EcoRI-cut site of pALCAI provides plasmid pALCAIAMY in which the alcA promoter 22 is operatively associated with
the α-amylase coding region. The correct orientation of the p501-derived α-amylase coding region within pALCAIAMY
is confirmed by sequencing across the ligation site according to standard procedures. The nucleotide sequence of the
promoter/coding region junction is shown in Figure 17.

A. nidulans may be transformed by the procedure described in Example 1, samples of extracellular medium being
taken and applied to glass fibre filter papers placed on 1% soluble starch agar. The filters are then removed after
8 hours at 37°C and the plates inverted onto beakers containing solid iodine (in a 50°C water bath). Clear patches indicate
starch degradation while the remaining starch turns a deep purple, thereby confirming the presence of secreted α-amylase.

In Examples 3-12 which follow, vectors are provided in which a secretion signal peptide coding region is introduced
in the vector in order to obtain secretion of a foreign protein from a filamentous fungus transformed by the entire vector.

Example 3 - Production of Plasmid pGL2, an intermediate vector 

A) Source of promoter and signal peptide sequence 

The glucoamylase gene of A. niger was isolated by probing a gene bank derived from DNA available in a strain
of this microorganism on deposit with ATCC under catalogue number 22343. The probing was conducted using
oligonucleotide probes prepared with Biosearch oligonucleotide synthesis equipment and with knowledge of the published
amino acid sequence of the glucoamylase protein. The amino acid sequence data was "reverse translated" to nucleotide
sequence data and the probes synthesized. The particular gene bank probed was a Sau 3A partial digest of the A.
niger DNA described above, cloned into the Bam HI site of the commercially available plasmid pUC12 which is both
viable in and replicable in E. coli.
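
Because most amino acids are encoded by more than one codon, a "reverse translated" probe is in general a mixture of sequences. The sketch below, which assumes the standard genetic code, counts how many distinct nucleotide sequences encode a given peptide and locates the least degenerate window of a peptide. The peptides used and the function names `probe_degeneracy` and `best_window` are hypothetical illustrations and are not the probes actually employed.

```python
from collections import Counter
from math import prod

# Standard genetic code as a 64-letter string (codons in T, C, A, G order).
# Counting letters gives the number of codons available for each amino acid.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO_ACIDS)


def probe_degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


def best_window(peptide: str, length: int) -> tuple[str, int]:
    """Return the window of the given length with the lowest degeneracy."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    best = min(windows, key=probe_degeneracy)
    return best, probe_degeneracy(best)


# Hypothetical six-residue peptide; a probe covering it would be 18 bases long.
print(probe_degeneracy("MSFRSL"))
print(best_window("SSMWSS", 2))
```

Windows rich in methionine and tryptophan, each of which has a single codon, give the least degenerate probes, which is why such regions are generally favoured when a probe is designed from protein sequence.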

A Hind III - Bgl II piece of DNA containing the glucoamylase gene was subcloned into pUC12. Subsequently, the
location of the desired promoter region, signal peptide coding region and protein coding region of the glucoamylase
was identified within pUC12 containing the sub-cloned fragment. The EcoRI/EcoRI fragment (see Figure 4) was shown
to contain a long, open translation reading frame when it was sequenced and the sequence data was analyzed using
the University of Wisconsin sequence analysis programmes.

Results of analysis of the nucleotide sequence of part of the region of the glucoamylase gene between the 5' Eco
RI site and the 3' BssH II site within the Hind III - Bgl II fragment are shown in Figure 6. This region contains the
glucoamylase promoter and the signal peptide coding region.

Within this fragment, i.e. at nucleotides 97-102, is a "TATA box" 48 which provides a site required by many eukaryotic
promoter regions for accurate initiation of transcription (probably an RNA polymerase II binding site). Accordingly the
presence of at least a portion of the promoter region is confirmed. Further, it is predictable from analogy with other
known promoter regions that all the functional essentials are likely to be contained within a sequence of about 1,000
bases in length and most likely within the first 200 bases upstream of the start codon for the coding region, i.e.
nucleotides 206-208 or "ATG" 49, the codon for methionine. Thus, the promoter and transcript leader terminate at nucleotide
205. The identity of the beginning of the promoter region is less crucial although the promoter region must contain the
RNA polymerase II binding site and all other features required for its function. Thus, whereas the Eco RI-Eco RI
sequence is believed to represent the entire promoter region of the glucoamylase gene, the fragment used in plasmid
pGL2 contains this fragment in the much larger Hind III - Bam HI/Bgl II segment to ensure that the entire promoter
region is properly included in the resultant plasmid.

On the basis that the amino acid sequence of mature glucoamylase is known (see Svensson et al, "Characterization
of two forms of glucoamylase from Aspergillus niger", Carlsberg Res. Commun. 47, 55-69 (1982)), the nucleotide
sequence of the signal peptide can be determined accurately. The signal peptide coding region of genes encoding secreted
proteins is known to initiate with the methionine residue encoded by the ATG codon 49. Determination of a sufficient
initial portion of the nucleotide sequence beyond, i.e. 3' of, the ATG codon provides information from which the amino
acid sequence of that portion may be determined. By comparison of this amino acid sequence with the published amino
acid sequence, the signal peptide can be identified as that portion of the glucoamylase gene which has no counterpart
in the published sequence with which it was compared. The glucoamylase signal peptide coding region defined herein
was previously confirmed using this method.
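
The comparison just described amounts to locating the N-terminus of the mature protein within the translated open reading frame; everything upstream of that point is the signal peptide. A minimal sketch follows. The function name `split_signal_peptide` is hypothetical, and the mature N-terminal residues used are placeholder letters chosen for illustration, not the published glucoamylase sequence.

```python
def split_signal_peptide(precursor: str, mature_n_terminus: str) -> tuple[str, str]:
    """Split a translated precursor into (signal peptide, mature protein).

    The signal peptide is the part of the precursor that has no counterpart
    in the sequence determined for the mature, secreted protein.
    """
    start = precursor.find(mature_n_terminus)
    if start < 0:
        raise ValueError("mature N-terminus not found in the translated sequence")
    return precursor[:start], precursor[start:]


# Translation of signal sequence 42 (Chart 1), in one-letter code.
SIGNAL = "MSFRSLLALSGLVCTGLANVISKR"
# Placeholder residues standing in for the start of the mature protein.
HYPOTHETICAL_MATURE = "ACDEFGHIK"

signal, mature = split_signal_peptide(SIGNAL + HYPOTHETICAL_MATURE, HYPOTHETICAL_MATURE)
print(signal)
print(mature)
```

In practice only the first few residues of the published mature sequence need to be matched, provided that they do not also occur within the signal peptide itself.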

By the above methods, the Hind III - Bam HI/Bgl II fragment resulting from Sau 3A partial digestion and incorporated
into pUC12 was confirmed to contain the following features of the glucoamylase gene: an initial, perhaps non-relevant
section, the promoter region, the signal peptide coding region and the remaining portion of the coding region. This
fragment, inserted into the pUC12 plasmid by scission with Hind III and Bam HI/Bgl II and ligation, appears schematically
in Figure 4 as plasmid pGLI. This plasmid contains all of the features necessary for replication and the like in order to
remain selectable and replicable in E. coli.

B) Construction of Plasmid pGL2 

Using pGLI as a precursor, plasmid vector pGL2 can be formed as shown in Figure 4. The restriction site BssH II
near the 3' end of the signal sequence 42 is utilized together with the unique downstream Sst I site in order to insert
a synthetic linker sequence A, B or C defined in Figure 5 herein. Thus, pGLI is cleaved with both BssH II and Sst I,
thereby removing the initial portion of the glucoamylase coding region 46 contained therein. Thereafter a selected one
of the synthetic linker sequences A through C, having been designed so as to be flanked by BssH II/Sst I compatible
ends, is inserted and ligated, thereby generating plasmid pGL2. Depending on which of the three linker sequences is
used, i.e. A, B or C, the resultant plasmid will hereinafter be identified as pGL2A, pGL2B or pGL2C, respectively.

The synthetic linker sequences identified herein are each equipped with unique Eco RV and Bgl II restriction sites,
as shown in Figure 5, into which a desired protein coding region may be inserted. Once inserted, the resultant plasmid
may be used to transform a host, e.g. A. niger, A. nidulans and the like. The presence of the promoter region and the
signal peptide coding region, both of which are recognized by the host, provides a means whereby expression of the
protein coding region and secretion of the protein so expressed is made possible.

Example 4 - Use of Plasmid pGL2 in creating pGL2BIFN

An example of the utility of the plasmid pGL2 is described below with reference to Figure 7, which shows
schematically the construction of plasmid pGL2BIFN from pGL2B.

The plasmid pGL2B is prepared as described in general previously for pGL2 save that synthetic linker sequence
"B" shown in Figure 5 is inserted specifically. The reference numeral 44 has accordingly been modified in Figure 7 to
read "44B". In order to make available an opening in the vector pGL2B, the plasmid is cut with Eco RV at the site
internal to linker 44B. The scission results in blunt ends which may be ligated with a fragment flanked by blunt ends
using ligases known to be useful for this specific purpose.

In the embodiment depicted in Figure 7, a fragment 60 containing the coding region of human interferon α-2 is
inserted to create pGL2BIFN. Specifically, a Dde I - Bam HI fragment 60 containing the coding region for human
interferon α-2 was excised from plasmid pN5H8 (not shown) on the basis of the known sequence and restriction map
of this gene.

The plasmid pN5H8 combines known plasmid pAT153 with the interferon gene at a Bam HI site. The interferon
gene therein is described by Slocomb et al., "High level expression of an interferon α-2 gene cloned in phage M13mp7
and subsequent purification with a monoclonal antibody", Proceedings of the National Academy of Sciences U.S.A.,
Vol. 79, pp 5455-5459 (1982).

In order to anneal the sticky ends of the interferon fragment into the cut Eco RV site of pGL2B, the sticky Dde I 
and Bam HI ends are filled using reverse transcriptase and ligated with an appropriate ligase according to techniques 
standard in the art. 

The advantage of selecting linker sequence B for insertion into pGL2 is manifest from Figure 8 which shows the
reading frame of the interferon α-2 coding region and its relationship with the recreated signal peptide sequence, in
terms of nucleotide sequence and amino acid sequence, where appropriate.

Figure 8 shows a portion of the promoter region 40 5' of the signal sequence joined with a portion of the
glucoamylase signal peptide sequence 42 beginning with the methionine codon ATG at 49 and ending with the lysine codon
AAG at 50. In fact, although the signal peptide coding region extends one residue further, i.e. to the CGC codon for
arginine at 52, this latter residue is comprised by the synthetic linker sequence 44B, engineered so as to compensate
for the loss of the arginine residue during scission and ligation to insert the linker sequence. In this way, the genetic
sequence of the signal remains undisturbed.

In a similar manner, the linker sequence provides for insertion of the interferon α-2 coding region without altering
the reading frame thereof. With reference to Figures 7 and 8, cleavage of linker sequence 44B by Eco RV results in
linker fragments 44B' and 44B" having blunt ends. Excision of the interferon α-2 coding region at the Dde I site results,
after filling in of the sticky ends created by the enzyme, in the desired nucleotide sequence without harming the
sequence of that coding region. Ligation within the Eco RV-cleaved linker sequence of the interferon sequence filled at
the Dde I site maintains the natural reading frame of the interferon coding region, as evidenced by the triplet codon
state between the linker portion 44B' and the interferon coding region 60. Had the linker A shown in Figure 5 been
chosen, which bears one less nucleotide than the linker B, the entire reading frame would have been shifted by one
nucleotide, resulting in a nonsense sequence. By selection of synthetic linker B, codons are made available between
the signal peptide sequence and the interferon coding region which do not alter the reading frame of the coding region
when the blunt-ended IF α-2 fragment is oriented correctly. The correct orientation is selected by sequencing clones
with inserts across the ligation junction.

Example 5 - Expression and Secretion from A. nidulans Transformed with pGL2BIFN ATCC 53371 

The plasmid pGL2BIFN was cotransformed, i.e. together with a separate plasmid containing an Arg B+ selectable
marker gene, into an Arg B- strain of A. nidulans, as described more fully in U.S. patent application serial No. 678,578
filed December 5, 1984. Arg B+ transformants were selected, of which 18 of 20 contained 1-100 copies of the human
interferon α-2 coding region (as detected by Southern blot analysis).

Several transformants were grown on starch medium to induce the glucoamylase promoter and the extracellular
medium was assayed for human IF α-2 using the CellTech IF α-2 assay kit.

All transformants exhibited some level of synthesis and secretion of assayable protein. Two controls, the host
strain (not transformed) and one Arg B+ transformant with no detectable human IF α-2 DNA, showed no detectable
synthesis of IF α-2 protein. In a separate experiment, transformation of A. niger, rather than A. nidulans, with pGL2BIFN
using, mutatis mutandis, the same procedure as described above, demonstrated the ability of A. niger to secrete IF α-2.

Thus, although the promoter and signal regions of pGL2BIFN are derived from A. niger they are shown to be 
operative in both A. nidulans and A. niger. 

In the present invention, use may be made of promoter regions other than the glucoamylase promoter region.
Suitable for use are the promoter regions of the alcohol dehydrogenase I gene and the aldehyde dehydrogenase gene,
illustrated in Figures 1A and 1B.

Example 6 - Construction of Plasmid pALCAIS ATCC 53368, an intermediate vector

For use with the present example, the alcA promoter was employed as comprised within a 10.3 kb plasmid pDG6
deposited with ATCC within host E. coli JM83 under accession number 53169. A plasmid map of pDG6 is shown in
Figure 2 and, for ease of reference, in Figure 9 to which reference is now made, to illustrate another embodiment using
the alcA promoter.

pDG6 comprises, in its Hind III-EcoRI (first occurrence) segment, the promoter region 22 of the alcA gene as well
as a small 5' portion of the alcA coding region 3' of the start codon, ligated to the endoglucanase coding region 30.
pDG6 further comprises a multiple cloning site 62 downstream of the C. fimi endoglucanase coding region 30.

To retrieve the alcA promoter region 22, pDG6 was cut with Pst I and Xho I, removing the bulk of the endoglucanase
coding region 30. In a second step, the linearized plasmid 64 was resected in one direction in a controlled manner with
exonuclease III (which will resect from Xho I-cut but not Pst I-cut DNA ends) followed by tailoring with nuclease S1. The
resection was timed so that the enzyme removed nucleotides to a position 50 bases 5' of the alcA ATG codon, leaving
the TATA box and messenger RNA start site intact.

Following resection, the vector 66 was religated (recircularized), creating vector 68 bearing Sal I-Xba I restriction
sites immediately downstream of the promoter region 22. Cleavage of vector 68 with Sal I/Xba I permits introduction
of a signal peptide coding region at an appropriate location within the vector.

The particular signal peptide coding region employed in the present example was synthesized to reproduce a
characteristic signal peptide coding region identified according to standard procedures as described by G. von Heijne
in Eur. J. Biochem. 17-21 (1983). The synthetic signal was engineered so as to provide a 5' flanking sequence
complementary to a Sal I cleavage site and a 3' flanking sequence enabling ligation with the Xba I restriction sequence.
The sequence of the synthetic secretion signal 68 is reproduced below:

Sal I                                                     Xba I
                                                          60  64
TCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCCTTCGCCAAGT
    GTACATGGCCAAGGAGCGGCAGTAGAGCCGGAAGGAGCGGTGACGGAAGCGGTTCAGATC
     MetTyrArgPheLeuAlaValIleSerAlaPheLeuAlaThrAlaPheAlaLysSerArg
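
The internal consistency of the synthetic signal can be verified by checking that the lower strand is the base-by-base complement of the upper strand and by translating the upper strand from its ATG. The sketch below uses only the two long strand segments as printed (ending at the lysine codon AAG) and assumes the standard genetic code; the variable and function names are illustrative.

```python
from itertools import product

# Strand segments as printed: upper strand 5'->3', lower strand written 3'->5'.
TOP = "TCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCCTTCGCCAAG"
BOTTOM_3TO5 = "GTACATGGCCAAGGAGCGGCAGTAGAGCCGGAAGGAGCGGTGACGGAAGCGGTTC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate in frame from the first base, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Bases of the upper strand left unpaired at the 5' end (the Sal I-compatible end).
overhang = len(TOP) - len(BOTTOM_3TO5)
strands_pair = TOP[overhang:].translate(COMPLEMENT) == BOTTOM_3TO5

# Reading frame of the signal, starting at the first ATG of the upper strand.
peptide = translate(TOP[TOP.find("ATG"):])

print(TOP[:overhang], strands_pair)
print(peptide)
```

The four unpaired 5' bases are those expected of a Sal I-compatible end, and the translation confirms the residues printed beneath the sequence, including Tyr, Ile and Ser at the second, eighth and ninth positions.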



The secretion signal per se begins with Met and ends with the fourth occurrence of Ala, as indicated by the arrow.
Once generated, the synthetic sequence 68 acting as signal is cloned into the Sal I-Xba I site of vector 70, resulting
in plasmid pALCAIS which contains alcA promoter region 22 and synthetic signal peptide coding region 68. That the
signal peptide coding region is inserted upstream of the multiple cloning site 62 is significant in that the site 62 allows
for cloning of a variety of protein coding segments within this plasmid.

Accordingly, pALCAIS constitutes a valuable embodiment of the present invention.

Example 7 - Construction of Plasmid pALCAISIFN 



As an example of the utility of pALCAIS, reference is made to Figure 10 showing creation of pALCAISIFN. This
plasmid comprises the promoter region 22 of the alcA gene and the synthetic signal peptide coding region 68, both of
which are derived from pALCAIS (Figure 9). In addition, it contains the coding region 60 coding for human interferon
α-2 derived from pGL2BIFN.

To obtain the protein encoding segment, pGL2BIFN is cleaved with Eco RI and partially cleaved with Bgl II (because
of the presence of internal Bgl II sites). Insertion of the protein coding region is accomplished by cleaving pALCAIS
with Bam HI and Eco RI, both of which are available in the multiple cloning site 62, and ligating this coding region therein,
thereby creating pALCAISIFN.

The nucleotide sequence of the resultant plasmid, from a site 1170 nucleotides downstream of Hind III to Eco RI,
is shown in Figure 11, indicating the relevant sites of restriction endonuclease digestion. It will be noted from sheet 3
of Figure 11 that the IF α-2 coding region 60 is in proper reading frame with the synthetic signal peptide coding region 68.

Example 8 - Expression and Secretion from A. nidulans Transformed with Plasmid pALCAISIFN

The plasmid pALCAISIFN prepared as described above was co-transformed into A. nidulans together with a plasmid
providing an argB selectable marker, the argB+ transformants were selected and checked for the presence of the human
interferon α-2 coding region, then grown on a threonine-containing medium to induce the alcA promoter, all as described
in example 3 above. The extracellular medium was assayed for human IF α-2 using the CellTech IF α-2 assay kit. Eleven
of twenty transformants showed secretion of interferon, induced in the presence of threonine and repressed in the
presence of glucose.

Example 9 - pGL2CENDO ATCC 53372

In accordance with the procedures described in the previous examples, there was constructed a vector plasmid
designated pGL2CENDO, from plasmid pGL2C ATCC 53367, analogous to pGL2BIFN shown in Fig. 7, but containing
the endoglucanase coding region in place of the interferon α-2 coding region, and using the synthetic linker sequence
"C" (Fig. 5) in place of linker sequence "B". A Bam HI fragment containing the C. fimi endoglucanase coding region 30
was inserted into the Bgl II site of pGL2C. A. nidulans transformants were prepared with this vector plasmid, and
showed starch-regulated secretion of cellulase, assayed as described in Example 1. The map of vector plasmid
pGL2CENDO is shown in Fig. 12 of the accompanying drawings, in which 30 denotes the endoglucanase coding region
(the endoglucanase coding region of Cellulomonas fimi, described in connection with Fig. 2 and Example 1), 42 denotes
the signal peptide coding region of the glucoamylase gene and 40 denotes the promoter region of the glucoamylase
gene. The nucleotide sequence is shown in Figure 13 and exemplifies that use of linker sequence C (Fig. 5) retains
the reading frame of the signal peptide coding region 42 and the endoglucanase coding region 30.

Example 10 - Construction of Plasmid pALCAISENDO ATCC 53370

In accordance with the procedures described in the previous examples, there was constructed a vector plasmid
designated pALCAISENDO by combining Eco RI-linearized plasmid pALCAIS as described in example 5 (Fig. 9) with
an Eco RI fragment derived from plasmid pDG5B (see Fig. 2) (pDG5 with the orientation of the Hind III fragment
reversed in pUC12) and containing the endoglucanase coding region 30. The map of pALCAISENDO is shown in
Figure 14 and the nucleotide sequence of its pertinent region is shown in Figure 15. In these figures, the promoter
region derived from alcA is designated by numeral 22, the synthetic signal peptide coding region is designated 68 and
the endoglucanase coding region is designated by reference numeral 30.

Example 11 - Expression and Secretion from A. nidulans Transformed with pALCAISENDO and pGL2CENDO

A. nidulans was co-transformed with an argB+ selectable marker and the plasmid pALCAISENDO or pGL2CENDO
prepared as described above. Of the co-transformants obtained, several showed varying levels of secretion of cellulase
(i.e. endoglucanase) as assayed on carboxymethylcellulose plates and by the monoclonal antibody test systems as
described in example 1. Both plasmid transformants showed secretion which was controlled by the linked promoter.
Plasmid pGL2CENDO was induced by starch and pALCAISENDO was induced with threonine.

Example 12 - Expression and Secretion From A. niger Transformed with pGL2CENDO 

A. niger was cotransformed with an argB+ selectable marker and the plasmid pGL2CENDO. Several of the
transformants showed varying levels of secretion of endoglucanase, assayed as described in example 1. This secretion
was induced by the presence of starch in the medium.

Example 13 - Increased Copy Number of Regulatory Genes

In Aspergillus nidulans the alcA promoter is turned on in the presence of the appropriate inducer, such as ethanol,
by the action of the gene product of alcR, the positive regulatory gene for alcA.

Evidence with multiple copy transformants (containing multiple alcA promoters) suggests that the alcR gene
product limits the promoter function of the several alcA promoters requiring stimulation.

Increasing the copy number of the alcR gene increases the expression of alcR and relieves this situation. The 
evidence for this is as follows: 

Transformants with multiple copies of the alcA promoter fused to its own coding region (ADH I) in a multiple alcR
background (which has been shown to overproduce alcR messenger RNA) do not grow well on ethanol. This is probably
due to rapid accumulation of aldehydes, the product of ADH breakdown of ethanol. ADH activity in these strains is
high. The increased activity of ADH due to increased copy number probably accounts for these observations.

Transformants with multiple copies of the alcA promoter fused to interferon α-2 in a multiple alcR background
produce significantly higher levels of secreted interferon. In these strains, unlike those with single copy alcR, many
more of the alcA promoters have access to the alcR regulatory protein.

Thus, preferred embodiments of the present invention provide means for introducing a coding region into a
filamentous fungus host which, when transformed, will secrete the desired protein. Particularly useful intermediate
plasmids for this purpose are pALCAIS and pGL2 (A, B or C).

Useful transformation vectors created from these plasmids include pALCAISIFN, pGL2BIFN, pALCAISENDO and
pGL2CENDO. Cultures of each of these and other plasmids mentioned herein are currently maintained in a permanently
viable state at the laboratories of Allelix Inc., 6850 Goreway Drive, Mississauga, Ontario, Canada. The plasmids will
be maintained in this condition throughout the pendency of this patent application and, during that time, will be made
available to authorized persons. After issue of a patent on this application, these plasmids will be available from the
ATCC depository recognized under the Budapest Treaty, without restriction. The accession numbers of the respective
deposits appear in the table below:

Plasmid        Host            Accession #    Deposit Date
pDG6           E. coli JM83    53169          June 7, 1985
pGL2A          "               53365          Dec. 16, 1985
pGL2B          "               53366          "
pGL2C          "               53367          "
pALCAIS        "               53368          "
pALCAISENDO    "               53370          "
pALCAISIFN     "               53369          "
pGL2BIFN       "               53371          "
pGL2CENDO      "               53372          "
pALCAIAMY      "               53380          Dec. 20, 1985



Claims 



1. A method for producing a polypeptide which comprises:

(i) culturing a filamentous fungus which has been transformed by a recombinant DNA construct in which a
regulatable promoter region of a filamentous fungus gene is operably linked to a DNA fragment encoding the
polypeptide, wherein:

(a) said DNA fragment is foreign to the promoter region; and

(b) said regulatable promoter region is inducible by ethanol, said ethanol inducibility being mediated by
the alcR gene product;

(ii) under conditions in which regulated expression of the polypeptide occurs; and
(iii) recovering said polypeptide.

2. A method according to claim 1 wherein the promoter region is repressible by glucose. 

3. A method according to claim 2 which comprises culturing the filamentous fungus in the presence of glucose during
the cell growth phase and adding an inducer in the absence of glucose to enhance the activity of the promoter
region.

4. A method according to any one of the preceding claims wherein the promoter region is derived from an alcohol 
dehydrogenase gene or an aldehyde dehydrogenase gene of a filamentous fungus. 


5. A method according to any one of the preceding claims wherein the DNA construct containing the regulatable 
promoter region is present in multiple copies. 

6. A method according to claim 5 wherein the filamentous fungal strain expresses greater than normal levels of alcR,
thereby increasing the level of expression of the multiple copies of the DNA construct containing the regulatable
promoter region.

7. A method according to any one of claims 1 to 6, wherein the regulatable promoter region is derived from an
Aspergillus gene.

8. A method according to claim 7, wherein the regulatable promoter region is derived from an Aspergillus nidulans 
gene. 

9. A method according to claim 7, wherein the regulatable promoter region is derived from an Aspergillus niger gene. 


10. A method according to any one of the preceding claims wherein the filamentous fungus is an Aspergillus.

11. A method according to claim 10 wherein said Aspergillus is A. nidulans.

12. A method according to claim 10 wherein said Aspergillus is A. niger.

13. A filamentous fungus as defined in any one of claims 1, 2 or 4 to 12.

14. A recombinant DNA construct comprising a regulatable promoter region of a filamentous fungus gene operably 
linked to a DNA fragment encoding the polypeptide, wherein: 

(a) said DNA fragment is foreign to the promoter region; and 

(b) said regulatable promoter region is inducible by ethanol, said ethanol inducibility being mediated by the
alcR gene product.

15. A recombinant DNA construct according to claim 14 which is as further defined in any one of claims 2 or 4 to 12. 



Claims (German text, translated)

1. A method for producing a polypeptide, comprising:

(i) culturing a filamentous fungus which is transformed by a recombinant DNA construct, wherein in the
construct a regulatable promoter region of a filamentous fungus gene is functionally linked to a DNA fragment
encoding the polypeptide, wherein:

(a) the DNA fragment is foreign to the promoter region; and

(b) the regulatable promoter region is inducible by ethanol, the ethanol inducibility being mediated by
the alcR gene product;

(ii) under conditions in which regulated expression of the polypeptide takes place; and

(iii) recovering the polypeptide.

2. A method according to claim 1, wherein the promoter region is repressible by glucose.

3. A method according to claim 2, comprising culturing the filamentous fungus in the presence of glucose during
the cell growth phase and adding an inducer in the absence of glucose in order to enhance the activity of the
promoter region.

4. A method according to any one of the preceding claims, wherein the promoter region is derived from an alcohol
dehydrogenase gene or an aldehyde dehydrogenase gene of a filamentous fungus.

5. A method according to any one of the preceding claims, wherein the DNA construct containing the regulatable
promoter region is present in multiple copies.

6. A method according to claim 5, wherein the filamentous fungal strain expresses alcR above the normal level,
whereby the level of expression of the multiple copies of the DNA construct containing the regulatable promoter
region is increased.

7. A method according to any one of claims 1 to 6, wherein the regulatable promoter region is derived from an
Aspergillus gene.

8. A method according to claim 7, wherein the regulatable promoter region is derived from an Aspergillus nidulans
gene.

9. A method according to claim 7, wherein the regulatable promoter region is derived from an Aspergillus niger gene.

10. A method according to any one of the preceding claims, wherein the filamentous fungus is an Aspergillus
fungus.

11. A method according to claim 10, wherein the Aspergillus fungus is A. nidulans.

12. A method according to claim 10, wherein the Aspergillus fungus is A. niger.

13. A filamentous fungus as defined in any one of claims 1, 2 or 6 to 12.

14. A recombinant DNA construct comprising a regulatable promoter region of a filamentous fungus gene which is
operatively linked to a DNA fragment encoding the polypeptide, wherein:

(a) the DNA fragment is foreign to the promoter region; and

(b) the regulatable promoter region is inducible by ethanol, the ethanol inducibility being mediated by the
alcR gene product.

15. A recombinant DNA construct according to claim 14, which additionally corresponds to the definition in any one
of claims 2 or 4 to 12.

Claims (French text, translated)

1. A process for producing a polypeptide, comprising:

(i) culturing a filamentous fungus which has been transformed by a recombinant DNA construct, in which a
regulatable promoter of a filamentous fungus gene is operably linked to a DNA fragment encoding the
polypeptide, in which:

(a) said DNA fragment is foreign to the promoter, and
(b) said regulatable promoter can be induced by ethanol, said inducibility by ethanol being mediated
by the product of the alcR gene;

(ii) conditions under which regulated expression of the polypeptide occurs, and

(iii) recovering said polypeptide.

2. A process according to claim 1, in which the promoter can be repressed by glucose.

3. A process according to claim 2, comprising culturing the filamentous fungus in the presence of glucose
during the cell growth phase and adding an inducer in the absence of glucose to increase the activity of the
promoter.

4. A process according to any one of the preceding claims, in which the promoter is derived from an alcohol
dehydrogenase gene or an aldehyde dehydrogenase gene of a filamentous fungus.

5. A process according to any one of the preceding claims, in which the DNA construct containing the
regulatable promoter is present in several copies.

6. A process according to claim 5, in which the filamentous fungus strain expresses higher than normal levels
of alcR, thereby increasing the level of expression of the multiple copies of the DNA construct containing the
regulatable promoter.

7. A process according to any one of claims 1 to 6, in which the regulatable promoter is derived from an
Aspergillus gene.

8. A process according to claim 7, in which the regulatable promoter is derived from an Aspergillus nidulans gene.

9. A process according to claim 7, in which the regulatable promoter is derived from an Aspergillus niger gene.

10. A process according to any one of the preceding claims, in which the filamentous fungus is an
Aspergillus.

11. A process according to claim 10, in which said Aspergillus is A. nidulans.

12. A process according to claim 10, in which said Aspergillus is A. niger.

13. A filamentous fungus as defined in any one of claims 1, 2 or 4 to 12.

14. A recombinant DNA construct comprising a regulatable promoter of a filamentous fungus gene operably linked
to a DNA fragment encoding the polypeptide, in which:

(a) said DNA fragment is foreign to the promoter, and

(b) said regulatable promoter can be induced by ethanol, said inducibility by ethanol being mediated by the
product of the alcR gene.

15. A recombinant DNA construct according to claim 14, which is as further defined in any one of
claims 2 or 4 to 12.

1

GG AT AC AGTTGGGCATTTCT AGGGCTG A ATGGG A AGG AG AG AG TTTTGAAATAGGCGTTC 
+ + + + + ' + 



CGTTCTGCTTAGGGTATTTGGGAAC AATCAATGTTCAATGTACATTTAATCC ACG ATTTT 
+ + + + + + 



ATAAAACGTCATCCTTTGCCCTCCCTTCTTATTTGCCAATACC AAAAATCTTACTCC AGT 

CGTTCGGTAATCGCAGAGTTAAATCTGGGCTCGGTGGCAGATCTCCGATCGTCCATAACC 
+ + i. + + + ; + 



GTTC AG ATGTTGATTGGAAC TGGGTGGGG TAG AC AGCTCCG A AG AC CGAGTGAACG TATA 
+ + + + + ^ 



CCTAAGACACTTTGACACGGCCGGAACACTGTAAGTCCCTTCGTATTTCTCCGCCTGTGT 
+ + + + + 



GGAGCTACC ATCCAATAACCCCCAGCTGAAAAAGCTGATTGTCG ATAGTTGTGATAGTTC 
+ + + + + + 



CC ACTTGTCCGTCCGCATCGGCATCCGC AGCTCGGGATAGTTCCGACCTAGG ATTGGATG 

CATGCGGAACCGC ACGAGGGCGGGGCGGAAATTGACACACC ACTCCTCTCCACGC AGCCG 
+ + + + + + 



TTCAAGAGGTACGCGTATAGAGCCGTATAGAGCAGAC ACGG AGCACTTTCTGGTACTGTC 
+ + + + + +. 



CGC ACGGG ATGTCCGC ACGGAGAGCC ACAAACG AGCGGGGCCCCGTACGTGCTCTCCTAC 

CCC AGGATCGC ATCCTCGCAT AGCTGAACATCTATATAAAGACCCCC AAGGTTCTC AGTC 
+ + + + + -h 



TCACC AACATCATCA ACC AAC AATC AACAGTTCTCTACTC AGTTAATTAGAACACTTCC A 

ATCCTATC ACCTCGCCTCAAAATGTGCATCCCCACTATGCAATGGGCCC AGGTCGCCGAG 1/ 

MetCysIleProThrHetGlnTrpAiaGlnValAlaGlu I 

FIG. 1A sheet 1    840
841

AAGGTCGGCGGCCCGCTCGTCT AC AAGC AC ATCCCCG TCCCT AAGCC CGGTCCCG ACC AG t 
+ + + + + + J 

LysValGlyGlyPr oLeu ValTy rLysGln lie Pro Val ProLysProCly ProAspCln 

ATCCTTGTGAAG ATC CGCTACTCTGGGGTTTGCC ACACCGACC TACACGCTATC ATCCGT 
+ + + + + + 

Xleteu ValLys IleArgTy rSerCly ValCysHlsThr Asp Leu HI s Ala Me c Me t Cly 



CACTGGCC AATCCCCGTCAAAATGCCCCTCGTCGGTGCGCACG AAGG AGCAGG AATCGTC 
+ + + + + — ► 

HisTrpProlleProValLysMetProLeuValGlyGlyHisGluClyAlaGly IleVal 

GTGGC AAAGGGCG AACTGGTCC ACG AATTCC ACATCGGCG ACCA AGC TGGC ATC AA ATGG 

+ + - - -+ + + . + 

ValAlaLysClyGluLeuValHisGluPheGluIleGlyAspGlnAlaGlylleLysTrp 



CTTAATCGTTCCTCCGCACAGTGCG AGTTCTGCCGCCAATCCGACGACCCCCTCTGTCC A 

— — + + + -«+ + 

LeuAsnGlySerCysGlyCluCysGluPheCys ArgGlnSer AspAspProLeuCys Ala 



10 



CGCGCCC AGC TCTCTGGGTATACTGTTGACGCC ACG TTCC AGC ACT ATCCGCTCGG AAAG 
+ + + + + + 

ArgAlaGlnLeuSerClyTyrThrVal AspGlyThr PheGlnClnTyr Ala LeuGly Lys 

CCC AGTC ATCCCTCC A AC ATCCCTC CGGCCGTTCCGGTGG ATCCCGCGCCCCC ACT ACTC 

+ > + + + 

AlaSerHl sAlaSerLys IleProAlaGly ValProVal AspAlaAlaAlaPro ValLeu 

TCTGCCGGTATTACAGTGTACAACCGATTGAAAG AGGCCGGGGTCCGGCCGGGCCAG ACC 
- + + + + + + 

CysAlaGlylleThrValTyrLysGlyLeuLysCluAlaGlyValArgProGlyGlnThr 

GTGCCCATCCTGGGTGCCCGTCCCGGCCTGGCATCCCTTGCACAGCAGTATGCG AAGCCC 
+ + + ♦ + + 

Val Ala I ie Va 1G1 yAl aClyGlyGly LeuGly Se r LeuAlaGlnClnTyr Ala Lys Ala 

ATCCGGATC AGCGTTC TCG C GGTCG ATGG GGG AC ATG AC A ACC GGCCCATGTG TG AC TCG 

+ + + +- + + 

MetGlylleArgValValAlaValAspClyClyAspGluLysArgAlaMetCysGluSer 

CTTGGAAC AGAGGTATGTACATGTTCTCAAXCTC AGGAAGC AAGC AACTGACCTGG ACAG 
+ + + + + h 

LeuClyThrClu IVS 1 

ACATATCTCGACTTCAC C A AATCT AAAG ACC TCG TGGC CG ACG TCAGGCACGG ACG C CCA 
+ + + + + -+ 

ThrTyrValAspPheThrLysSerLysAspLeuValAlaAspValArgHisGlyArgGly 

FIG. 1A sheet 2    1560
1561

TGTCTCGGTGCG CACGCGGTGATCCTGCTTGCCGTGTCAG AG AAGCCCTTCC AGC AAGCC 

+ — - + . ♦ + ! 

CysLeuGlyAlaHis Al a Val lie Leu Leu Ala Val Se rGlu Lys Pro PheCl nC 1 n Ala 

ACTGAATATGTGCGC tctcgcgccacaattcttgctattggcttgccccc AGATGCGTAC 

+ 4. + + + + 

ThrGluTyrValArgSer ArgGlyThr He ValAlalleClyLeuProPro Asp AlaTy r 

CTC AAGGCCCCTCTG ATC AAC ACAGTTGTTCGCATGATC ACTATC AACCCCAGCTACGTT 

+ + + - + + 

LeuLysAlaProVal He AsnThr Val Va 1 ArgMe t He Thr He ly s Cly Se r Ty r Va 1 

GGAAACCCACAGCACGCTGTCCACCCTCTGGACTTCTTCGCTCCCGCCCTCATCAAGGCT 

+ + M + > — + + 

GlyAsnArgClnAspGlyValCiuAlaLeuAapPhePheAlaArgGlyLeuIleLysAla 

CCCTTCAAG ACGGCTCCTCTG AAGC ATCTGCCGAACATTTACG ACCTTATGCGTGCGTTG 

. 4. 4. „--+ + + 



ProPheLysThrAlaProLeuLysAspLeiiProLysIleTyrCluLeuMet 

ACTCCCATATCCGATCTTC AATTCTCTTTCCG.CCATATATTTAGATACTAATCGCTTCCA 
+ + + + + 



aVS 2. 



C AAC AAGCC AGAATCCCCCCTCCTTATCTGCTAGAG ATGCCAGAATAAGCGTTTCA ACGC 

+ + + + + + 

GluGlnClyArglleAlaGlyArgTyrValLeuGluMe tProGluEnd 

CC ACGCGCTGGAACTACAAAC ACAATCGTCAGATGTTTCATGTTTATGATGTCC ATGCTT 
. _ . + + + + 



h 

10 



G ATATCTTTCTATAT AGTTTTC AATCAAGTGGTACAATGATTTTGGCCTTGGTTC AACCG 



ATTTCCCTTCC ACTTTTCCTAGCCTGATACGGATAGC ACTTGTAAGCAATCATAAACCAC 

4. . + + 



AG ACTGG ATAG ACTGCG AAGTATAGGTATTATGGT AGCCAAATGCG ATGATTTG ACTTTC 

4. + + 



AATCAAGCCTCAAACTAGCCCTCACCGTCTCTTTTTCCGGTTATTTTTAGTCGATTCACG 
+ + 2280

FIG. 1A sheet 3
GGATCC ATTTTCCAGCATGTCG ACC AAACTGCAAATAC AAGTGTACG AAGG ACGGCGTAT 
+ + + + + ^ 

AGTAACGG A ACC ACTCCC ACCC AAGC AACCG AG AATG ACGTCTC AGACTCTGCGAGTGAG 
+ + + +• ♦ + 

CCGGGCTCC AATC ACGG AACTTCTCCATGGTCATCAACCCCGCATG ATCTTCTCATC ACG 
+ + + + + + 

CCTCTTGGTTCGTAATTTTCATTTTTGC ATTACGGCC TCGGTTATC ATCGC AGCC TCCAC 
+ + + + ^ 

CAC ATAGTCGTCAAGATAGGTCCAG AATCAGTCCGCTCTAGGGGGGT AAATCGTAAATTG 
+ + + ^ + + 

CAATTCGCATTACGGTCTGGGTTATCG ATCGCGGGGATCCTCAACTTTGTTTCAGAACCA 
- - +- + + --+- + + 

GG GTGCXGT AG GTTGT AG ATCGTAAGTTTCATCCTGCATTACCCGCCTCGGTTATTATCG 

CGAGCTCTTCAACGTGTTTTCAGAATCATCTAGGCTCGTGGAGGCAGTCGGCACCGCGGC 
+ + + -+ h + 

GAAGGCGACGGAATCC AGTTCACCTGGACTGGCTCTTG AAG ACCAGTGGGGCACTTCGGC 
'■ + + + + + + 

GGGTTGCT AGCTTGCTACATGTAATTTCCATGGGTAACAGCTATCCTC AACAAGAGCGGC 

*♦* + + + 

TCCGCTTGACC TGTTCCCCTCCTTTCCCCTCTTTTCCTGCGACC ACTGGCTC AGTGCT AC 
CAAAGCCAGAGCGGTATTATTAAACCTCCCTCGTCCTCCCACCCAGCCAGCATTTCTCCC 

........... 

TACTCCAACTCTCCTCTCCCAAGATACCCATATTTCCCGCTCACC ATGTCTGATTTGTTC 



+ + 

Me tSer AspLeuPhe 



ACC ACC ATCGAGACTCCGGTCATCAAATATGAGC AGCCTCTCGGCCTGTATGACGTTTTC 
ThrThrlleGluThrProVallleLysTyrGluGlnProLeuGlyLeu 

FIG. 1B sheet 1    840
841

TCGCCTCCTC ATTTTTTTTGTGTTGTGTTTATTAACG ATC ATTGGGTTGTAGGTTC ATC A 

+ + + + + +■ 

• IVS 1 PhelleA 



AC AACGAGTTCGTGAAGGCCGTTGAGGGCAAGACCTTCCAGGTCATC AACCCCTCC AACG 
+ +. + + + + 

snAsnGluPhe Val LysGly ValGluGly LysThr PheGln Val He AsnProSer AsnG 



AGAAGGTCATC ACCTCCGTCCACGAAGCCACCGAGAAGGATGTTGATGTCGCCGTCGCTG 
+ + + + + 

luLysVal IleThrSerVal HisGlu AlaThrGlu Ly sAsp Val Asp Val Al a Val Ala A 



C TCCCCGTGCTGCCTTTGAgGCGCCATGGCGCCAGGTC ACCCCCTCTG AGCGTGGCATTT 

+ — + + + + + 

laAlaArgAlaAlaPheGluGlyProTrpArgGlnValThrProSerGluArgGlylleL 



TGATCAACAAGCTGGCGGATCTGATGGAGCGTCATATCCACACCCTCGCCGCTATCGAGT 
+. + + + + + 

eulleAsnLysLeuAlaAspLeuMetGluArgAspIle AspThr LeuAlaAla lie Glu S 

CTCTCG ACAACGGC AAGGC TTTCACC AXGGCCAAGGTCG ATCTTGCC AACTCc ATTGGTT 

+ — + + + + + 

erLeuAspAsnGlyLys AlaPheThrHe t AlaLys Val AspLeu AlaAsnSe rl leGlyC 

CCTTGCG AT ACT AC GCTGGCTGGGC GG AC AAG AT TCACCGTC AG ACCATTG AC ACC AACC 
+ + + + + + 

ysLeuArgTyrTyrAlaGlyTrpAla AspLys IleHlsGlyGlnThr He AspThr As tiP 

CCGAGACTCTTACCTAC ACCCGCCACG ACCCCGTTGGTGTTTGCGGTC AGATC ATCCCCT 
+ + + + „ + + 

roGluThrLeuThrTyrThrArgHisGluProValGlyValCysGlyGlnllelleProT 

CG AACTTCCCCCTTCTGATGTGGTCCTGG AAGATTGG ACCCGCTGTTGCCGCTGGTAAC A 
+ + + + + + 

rpAsnPheProLeuLeuMe t Tr p Se r Tr pLy s II eGly ProAla Val Al aAlaGly As nT 



CTGTTGTCCTC AAGACCGCCCAGC AGACCCCTCTCTCCGCCCTTTACGCTGCT AAGCTGA 
+ + + + + 

hr Val Val LeuLysThrAl aGlnGlnThr Pro Leu Ser AlaLeuTy r Ala AlaLys Leu I 



TC AAGGAGGCTCC At CCCCCGCTGGTGTGATC AACGTCATCTCTGGCTTTGGCCGTACCG 
+ + + + + + 

leLysGluAlaProPheProAlaGlyVallleAsnVallleSerGlyPheGlyArgThrA 



CTGGTGCTGCCATCTCCAGCCACATGGAC ATTGACAAGGTTGCCTTC AC TGGCTCTACTC 
+ + + + + + 

laGlyAlaAlalleSerSerHisMetAspIleAspLysValAlaPheThrGly SerThrL 

FIG. 1B sheet 2    1560
1561

TTGTTGGACCTACC AcCCTGC AGGCCCCTGCTAAGAGC AACCTGAAGAAGGTCAC TCTTG 
+ + + — + 

euValClyProThrll e LeuG In Al a Al aAl aLy s Ser AsnLeuLysLysValThr LeuC 

AGCTCGGTCGCA AGTCTCCC AAC ATCGTCTTTG ATCATGCTG ACATTGAC AACGCCATTT 

+ + + +. + 

luLeuGlyGlyLysSerPf oAsnlleValPheAspAspAlaAspIle Asp Asn Al a 11 e S 

CCTGGGCC AACTTTGGTATCTTCTTCAACCACGGCC AGTGCTGCTGTGCTGGATCCCGTA 

+ + + + + + 

erTrpAlaAsnPheGlyllePhePheAsnHisGlyGlnCysCysCyaAlaGlySerArgl 

TCCTGGTCCAGG AGGGC ATCTACGAC AAGTTCGTCGCCCGCTTC AAGGAGCGTGCCCAG A 

+ + + + + + 

leLeuValGlnGluGlylleTyrAspLysPheValAlaArgPheLysGluArgAlaClnL 

AGAAC AAGGTCGG AAACCCCTTCG AGCAGGACACCTTCC ACGGTCCCCAGGTTTCCC AGC 

+ + + + + + 

ysAsnLysValGlyAsnProPheGluGlnAspThrPheGlnGlyProGlnValSerGlnL 

TCC AGTTCG ACCGTATC ATGGAGTAC ATC AACC ACGGC AAGAAGGCTGGTGCTACCGTCG 

_ + + + + + + 

€uClnPheAspArgIleMetGluTyrIleAsnHisGlyLysLysAlaGlyAlaThrValA 

CC ACCGGTGGTGACCGCC ACGGCAACGAGGGTTACTTC ATCC AGCCTACTGTCTTC ACAG 

. __ + + + „ + + 

laThrGlyGlyAspArgHisGlyAsnGluGlyTyrPhelleGlnProThrValPheThrA 

ACGTC ACTTCCGACATGA AGATTGCCCAGGAGG AGATCTTCGGTCCTGTCGTCACTATCC 

+ + + + + + 

spValThrSerAspMecLysIleAlaGlnGluGluIlePheClyProValValthrlleG 

AGAAGTTCAAGGATGTGGCTGAGGCTATCAAGATCGGCAACTCG ACCGACT ACGGTACGT 

„,,. 4. + + + 



InLysPheLysAspVal AlaGluAl alleLys IleGlyAs n Se rThr AspTy r 

CTATCTTTTCTGGTCTTTGCCCAXATTTTGTTCCTAACATACGCACAGGTCTTGCTGCTG 
XVS 2 .GlyLeu AlaAlaA 



CCGTGCAC ACAAAG AACGTCAAC ACCGCC ATTCGCGTGTCCAACGCTCTGAAGGCTGGT A 

+ + + + + + 

laValHlsThrLysAsnValAsnThrAlalleArgValSerAsnAlaLeuLysAlaGlyT 

CCGTCTGCATC AAC AACTACAAC ATG ATCTCGTACCAGGCTCCCTTCGGTGGCTTCAAGC 

+ + + + + + 

hrValTrpIleAsnAsnTyrAsnMecIleSerTyrGlnAlaProPheGlyGlyPheLysG 

FIG. 1B sheet 3    2280
2281

AGTCCGGTCTCGGCCGTGAGC TTGCCTCTTACCCTCTTG AGAACTACACAC AG ATC AAGA 
+ + + + + 

InSerGIyLeuGlyArgGlu LeuGly SerTyr AlaLeuClu Asn TyrThrGln He Lys T 



CGGTGC ACTACCGCCTGGGTG ATGCTCTTTTCGCTTAAAGCT AATTGTATGATTTGATGA 
+ + + + + + 

hrValHisTyrArgLeuGlyAspAlaLeuPheAlaEnd    2400



10' 



AATTGCGAATAC AAGTTGG AT ATATCCTGTGTGCTACGGC ACTGGTTC AAATTGCTTCTT 
+ + + + + + 



GTGCAGC AACCATGTGACTC ATGTAAAAC AT ATCAGATAACCCCGGATACGATTTTACGA 
+ + + + + + 



TTTTTTAGATTTGCTTTTATCGTAGCGTCCACTTATCCTCGTCCGGCCAAGCAC AAAACC 
+ + + + + + 



TATGGCTATCTTCAGCACGCCGCG ATCCTG AAG CGTAGCTGGATTGG AAATCCG AAATC A 
+ + + + + + 



AC TGC C CCGC AGCC ACC G AC AC TCGGGCTCCGGGC AAG TCCCCGCGAA ATCCCTC AC C AC 
+ +. + + + + 

2700 
[Drawing: map with the legible labels Hind III, EcoRI and ATG, and reference numeral 12]

FIG. 3
1    BglII/XhoII   EcoRV
CGCGCAGATCTCGATATCGAGCT
    GTCTAGAGCTATAGC                SstI
ArgAlaAspLeuAspIleGlu??

     BglII/XhoII   EcoRV           B
CGCGCAAGATCTCGATATCGAGCT
    GTTCTAGAGCTATAGC               SstI
ArgAlaArgSerArgTyrArgAla

1    BglII/XhoII   EcoRV           25
CGCGCAAAGATCTCGATATCGAGCT
    GTTTCTAGAGCTATAGC              SstI
ArgAlaLysIleSerIleSerSer?

FIG. 5
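
The three linkers of FIG. 5 differ only in the number of A residues ahead of the BglII/XhoII site, which shifts the downstream reading frame by one base each time. A minimal Python sketch of how the labelled sites could be located in the top strands is given below. The recognition sequences used (BglII AGATCT, EcoRV GATATC, XhoII RGATCY) are the standard ones and are supplied here as an assumption, since the figure gives only the enzyme names; the names `SITES`, `LINKERS` and `find_sites` are illustrative.

```python
import re

# Recognition sequences (standard values, not taken from the figure itself).
# XhoII recognises RGATCY, where R = A or G and Y = C or T.
SITES = {
    "BglII": "AGATCT",
    "EcoRV": "GATATC",
    "XhoII": "[AG]GATC[CT]",
}

# Top strands of the three linkers shown in FIG. 5, in figure order.
LINKERS = [
    "CGCGCAGATCTCGATATCGAGCT",
    "CGCGCAAGATCTCGATATCGAGCT",
    "CGCGCAAAGATCTCGATATCGAGCT",
]


def find_sites(seq):
    """Return the 0-based start offsets of each recognition site in seq."""
    hits = {}
    for name, pattern in SITES.items():
        # A lookahead lets overlapping occurrences be reported as well.
        hits[name] = [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]
    return hits


for linker in LINKERS:
    # Length modulo 3 shows the one-base frame shift between the linkers.
    print(len(linker), len(linker) % 3, find_sites(linker))
```

Each linker carries one BglII/XhoII site and one EcoRV site, and the lengths of 23, 24 and 25 bases are consistent with the three different translations printed under the linkers.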
1
EcoRI
GAATTCAAGCTAGATGCTAAGCGATATTGCATGGCAATATGTGTTGATGCATGTGCTTCT

TCCTTCAGCTTCCCCTCGTGCAGATGAAGGTTTGGCTATAAATTGAAGTGGTTGGTCGGG

GTTCCGTGAGGGGCTGAAGTGCTTCCTCCCTTTTAGACGCAACTGAGAGCGTGAGCTTCA

TCCCCAGGATCATTACACCTCAGCAATGTCGTTCCGATCTCTACTCGCCCTGAGCGGCCT
                         MetSerPheArgSerLeuLeuAlaLeuSerGlyLe

                                 BssHII
CGTCTGCACAGGGTTGGCAAATGTGATTTCCAAGCGCGCG
uValCysThrGlyLeuAlaAsnValIleSerLysArgAla
                                     280

FIG. 6
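
The amino acid line printed under the FIG. 6 sequence can be checked against the nucleotides with a short translation routine. The following Python sketch assumes the standard genetic code and uses the last 100 bases of FIG. 6 with the scan's spacing removed; `translate` and `FIG6_TAIL` are illustrative names, and the first ATG in that stretch is taken as the start codon because that is where the figure's translation begins.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, with "End" for stop as in the figures.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna, start=0):
    """Translate dna from offset start, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces left by the scan
    residues = []
    for pos in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[pos:pos + 3])
        if aa is None:
            residues.append("Xaa")  # codon contains an unreadable base
            continue
        residues.append(THREE_LETTER[aa])
        if aa == "*":
            break
    return "".join(residues)


# Positions 181 to 280 of FIG. 6 (assumed transcription of the scan).
FIG6_TAIL = (
    "TCCCCAGGATCATTACACCTCAGCAATGTCGTTCCGATCTCTACTCGCCCTGAGCGGCCT"
    "CGTCTGCACAGGGTTGGCAAATGTGATTTCCAAGCGCGCG"
)

print(translate(FIG6_TAIL, FIG6_TAIL.find("ATG")))
```

Under these assumptions the output reproduces the 25 residues shown in the figure, from Met through the Ala encoded at the BssHII site.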
181



40 



49 

4- 



U2 



TCCCCAGCATCATTACACCTCAGCAATGTCGTTCCGATCTCTACTCGCCCTGAGCGGCCT 
AGGGGTCGTAGTAATGTGG AGTCGTTACAGC AAGGCTAG AGATGAGCCGCACTCCCCGG A 

240 

Me cSer Phe Arg Se r Leu Leu Ala Leu Se rGlyLeu 



.241 



42 

£ 



52 44B' 



Ddefilled 



50 T ua / er " 

A>— i ^ 



cgtctccacagggttgccaaatgtgatttccaagcgcgcaagatctcgattcagctccaa ; 

GCAGACGTGTCCCAACCGTTTACAC TAAAGGTTCGCGCGTTCTAGAGCTAAGTCGACGTT 
V a lCysThrGlyLeuAlaAsnValIleSerLysArgAlaArgSerArgPheSerCy?Lys 



301 



60 



GTC AACCTGCTCTGTGGGCTGTG ATCTGCCTCAAACCC ACAGCCTGGGTAGCAGGAGGAC 

CAGTTCCACCAGACACCCG AC ACTAGACGG AGTTTGGGTGTCGGACCCATCGTCCTCCTG 

360 

SerSerCysSerValGlyCysAspLeuProClnThrHisSerLeuClySerArgArgThr 



361 
/ 



60 



CTTG ATGCTC 
+ 

CAACTACGAG 
LeuMetLeu 



EcoRl 
1_ 



/ v 

AAAACCGCATCATCGAGCTCC AATTC 
TTTTGGCCTACTAGCTCGAGCTTAAG 

1206 
1200

GGATACAGTTGGGC ATTTCTAGGGCTGAA 
+ + + 

CCTATGTC AACCCGTAAAGATCCCG ACTT 



TGGGAAGGAG AGAGTTTTG AAAT AGGCGTTCCGTTCTGCTTAGCGTATTTGGGAACAATC 

+ s-+ + + + + 

ACCCTTCCTCTCTC AAAACTTTATCCGCAAGGC AAG ACG AATCCC ATAAACCC TTGTTAG 



AATGTTCAATGTACATTTAATCCACG ATTTTATAAAACG TCATCCTTTGCCCTCCCTTCT 

+ + + + + + 

TXAC AAGTTAC ATGTAAATTAGGTGCTAAAAT ATTTTGC AGTAGGAAACGGGAGGGAAG A 



TATTTGCC AATACC AAAAATCTTACTCCAGTGGTTCGGTAAT CGC AGAGTTAAATCTGGG 

+ + + + + + 

ATAAACGGTTATGGTTTTTAG AATG AGGTCACC AAGCC ATTAGCGTCTC AATTTAGACCC 



CTCGGTGGC AGATCTGCG ATCGTCCATAACCGTTC AGATGTTGATTGG AACTGGGTGGGG 

+ + + + + + 

GAGCC ACCGTCTAGACGCTAGCAGGTATTGGCAAGTCTACAACTAACCTTGACCC ACCCC 



TAGACAGCTCCGAAG ACCG AGTGAACCTATACCTAAGAC ACTTTGAC ACGGCCGG AAC AC 

+ + + + + + 

ATCTGTCCACGCTTCTGGCTCACTTGCATATGGATTCTGTGAAACTGTGCCGGCCTTGTC 



TGTAAGTCCC TTCGTATTTCTCCGCCTGTCTGGAGCTACC ATCCAATAACCCCC AGCTGA 
+ + + + + + 

1560 



FIG. 11 sheet 1 
AC ATTC AGGG A AGC ATAAAG AGGCGG AC AC ACCTCGATGGTAGGTTATTGGGGGTCG ACT 



1561 , . 

AAAAGCTG ATTGTCGATAGTTGTG ATAGTTCCCACTTGTCCGTCCGC ATCGGC ATCCGCA 
TTTTCGACX AAC AGCTATCAACACTATCAAGGGTG*AAC AGGC AGGCGTAGCCGTAGGCGT 



GCTCGGGATAGTTCCGACCTAGG ATTGGAT.GC ATGCGC AACCGCACg AGGGCGGGGCGGA 
" - -+-- + . — -+--- 

CGAGCCCTATCAAGGCTGGATCCT AACCTACGTACGCCTTGGCGTGcTCCCGCCCCGCCT 



AATTGACAC ACC ACTCCTCTCC ACGCAgCCGTTC AAGAGGTACGCGTATAG AGCCGTATA 
+ + + + + + 

TTAACTGTGTGGTGAGGACACGTGCGTcGGCAAGTTCTCC ATGCGC ATATCTCGGC ATAT 



GAGCAGAGACGGAGC AC TTTCTGGTACTGTCCGC ACGGG ATGTCCGCACGGAGAGCC AC A 
+ + + + + + 

CTCGTCTCTGCCTCGTG AAAGACCATGAC AGGCGTGCCCTACAGGCGTGCCTCTCGGTGT 



AACGAGCCCGCCCCCtfTACCTCCTCTCCTACCCCAGGATCGCATCCTCGCATAGCTGAAC 
" + + + + + + 

TTCCTCGCCCCGGGGCATGC ACG AGAGGATGCGGTCCTAGCGTAGG AGCGTATCG ACTTG 



ATCTATATAAAGACCCCCAAGGTTCTC AGTCTCACC AAC ATCATCAACC AACAATCAAC A 

" + ~+sr~ — — + * — 

FIG. 11 sheet 2 1920 
TAGATAT ATTTCTGGGGCTTCC AAGAGTCAGAGTGCTTCTAGTAGTTGGTTGTTACTTGT 

SalI 68



1921 



GCGTCC ACATCTACCCGTTCCTCGCCGTCATC TCGGCCTTCCTCGCC ACTGCCTTCGCC A 
+ + + + + + 

CCC AGGTGT AC ATGGCC AAGGAGCGGC AGTAG AGCCGGAAGGAGCGCTGACGG AAGCGGT 

MecTyrArgPheLeuAlaVallleSerAlaPheLeuAlaThrAlaPhe AlaLys 

XbaI    BamHI/BglII fusion    6

AGTCTAG AGGATCTCG ATTCAGCTGCAAGTC AAGCTGCTCTGTGGGCTGTGATCTGCCTC 
+ + + + + + 

TCAG ATCTCCTAGAGCTAAGTCGACGTTCAGTTCGACG AGAC ACCCG AC ACTAGACGG AG 
SerArgGly SerArgPheSerCysLys SerSerCysSerValGlyCys AspLeuProCln 



AAACCC ACAGCCTGGGTAGCAGG AGGACCTTGATGC TCCTGGC ACAGATGAGGAG AATCT 

+ + + + + + 

TTTGGGTGTCGGACCC ATCGTCCTCCTGG AACTACGAGGACCGTGTCTACTCCTCTTAGA 

ThrHis SerLeuGly SerArg ArgThrLeuMet LeuLeuAlaGlnMe tArgArglieSer 



C TCTTTTCTCCTGCTTGAAGCACAGACATGACTTTGGATTXCCCCAGG AGGAGTTTGGCA 

+ + + + + + 

G AG AAAAGAGG ACG AACTTCCTGTCTGTACTGAAACCTAAAGGGCTCCTCCTC AAACCGT 

LeuPheSerCysLeuLysAspArgHi sAspPheGly Phe Pr oGlnGlubluPheGly Asn 



ACCAGTTCC AAAAGGCTGAAACCATCCCTGTCCTCCATGAG ATGATCCAGC AGATCTTCA 

- + „. + + + ♦ 

TGGTC AAGGTTTTCCGACTTTGGTAGGGACAGGAGGTACTCTACTAGGTCGTCTAGAAGT 

2220 

GlnPheGlnLysAlaGluThrlleProValLeuHisGluMetllcGlnGlnllePheAsn 

A 



FIG. 11 sheet 3 
2221

ATCTCTTC AGC AC AAACGACTCATCTGCTGCTTGGG ATG AGACCCTCCTAGACAAATTCT 

+ , + + + + + 

TAGAG AAGTCGTGTTTCCTGAGTAGACGACG AACCCTACTCTGGG AGGATCTGTTT AAGA 

LeuPheSerThrLysAspSer Ser AlaAlaTrp AspGluThrLeuLeuAspLysPheTyr 



AC AC TGAACTCT AC C AGC AGC TG A AT GACCTGGA AGC CTGTGTG AT AC AGG GGGTGGGGG 
+ _ + + + + + 

TGTGACTTGAGATGGTCGTCGACTTACTGGACCTTCGGACACACTATGTCCCCCACCCCC 
ThrGluLeuTyrClnClnLeuAsnAspLeuGluAlaCysVallleGlnGlyValGly Val 



TG ACAGAGACTCCCCTGATGAAGG AGG ACTCC ATTCTGGCTGTGAGG AAATACTTCC AAA 

+ + + + + + 

ACTGTCTCTG AGGGG ACT AC TTCCTCCTG AGG TAAGACCG AC AC TCCTTTATGAAGGTTT 

ThrGluThrProLeuMetLysGluAspSerlleLeuAlaValArgLysTyrPheGlnArg 



G AATC ACTCTCTATCTGAAAG AGAAG AAATACAGCCCTTGTGCCTGGGAGGTTGTCAG AG 

+ + + + -~+ + 

CTTAGTGAGAG AT AGACTTTCTCTTCTTTATGTCGGGAACACGG ACCCTCC AAC AGTCTC 

IleThrLeuTyrLeuLysGluLysLysTyrSerProCysAlaTrpGluValValArgAla 



60 



CAGAAATCATGAGATCTTTTTCTTTGTCAACAAACTTGC AAGAAAGTTTAAGAAGTAAGG 

+ + + + + + 

GTCTTTAGTACTCTAGAAAAAGAAACACTTGTTTGAACGTTCTTTC AAATTCTTC ATTCC 

2520 

G lull eMe c ArgSe r Phe Ser Leu SerThrAsnLeuGlnGlu Ser Leu Arg Ser LysGla 

FIG. 11 sheet 4 
2521

AATC AAAACTGGTTCAACATGGAAATGATTTTCATTGATTCGTATGCCACCTC ACCTTTT 

+ + + + + 

TTACTTTTGACC AAGTTGTACCTTTACT AAAAGTAACT AAGC ATACGGTCGAGTGGAAAA 
End 



TATCATCTGCC ATTTCAAAGACTCATGTTTCTGCTATGACCATGAC ACGATTTAAATCTT 
-+ ♦ + + + + 

ATACTAG ACGGTAAAGTTTCTGAGTACAAAGACG ATACTGCTACTGTGCTAAATTTAGAA 



TTC AAATGTTTTTAGGAGTATTAATC AACATTGTATTCAGCTCTTAAGGC ACTAGTCCCT 
+ + + + + + 

AAGTTTACAAAAATCCTC ATAATTAGTTGTAACATAAGTCG AGAATTCCGTGATC AGGGA 



TACAGAGGACCATGCTGACTGATCCATTATCTATTTAAATATTTTTAAAATATTATTTAT 
+ + + + + + 

ATGTCTCCTGGTACGAC XG ACTAGGTAATAG ATAAATTTATAAAAATTTTATAATAAATA 



TTAACTATTTAT AAAACAACTTATTTTTGTTCATATTATGTC ATGTGCACCTTTGCAC AG 
+ + + + + + 

AATTGATAAATATTTTGTTGAATAAAAACAAGTATAATAC AGTACACGTGGAAACGTGTC 



TCGTTAATGTAATAAAATATCTTCTTTGTATTTGGTAAAAAAAAAAAAAAAAAAAAAAAA 
+ + + + + + 

ACC AATTACATTATTTTATACAAGAAAC ATAAACCATTTTTTTTTTTTrTTTTTTTTTTT 

EcoRI 

AAAAAAAAAAAACCGGATCATCGAGCTCGAATTC 

+ ♦ + 

TTTTTTTTTTTTGGCCTAGTAGCTCGAGCTTAAG 

2914 
EcoRI

GAATTCAAGCTAGATGCTAAGCGATATTGC ATGGCAATATCTGTTGATGCATGTGCTTCT 
+ + + + + + 

CTTAAGTTCGATCTACGATTCGCTATAACGTACCGTXATACAC AACTACGTACACGAAG A 



TCCTTCAGCTTCCCCTCGTGCAGATG AAGGTTTGGCTATAAATTGAAGTGGTTGGTCGGG 
+. + + + - + + 

AGG AAGTCG AAGCGG AGCACCTCTACTTCCAAACCGATATTTAACTTCACC AACCAGCCC 



GTTCCGTGAGGGGCTGAAGTGCTTCCTCCCTTTTAGACGCAACTGAGAGCCTGAGCTTCA 
+ + - — + + + + 

C AAGGC ACTCCCCGACTTC ACGAAGCAGGGAAAATCTGCGTTG ACTCTCGGACTCGAAGT 

TC CCC AGCATC ATTACACC TC AGCA'ATGTCGTTCCG ATCTCTACTCGCCCTGAGCGGCCT 

+ ♦ + + + + 

AG GG CTCGT ACT AATG TGGAGTCGTT AC AGC AAGGC TAGAG ATG AGCGGGACTCGCCGG A 

Met SerPhe Arg.Ser Leu Leu AlaLeuSe rClyLeu 



BamHI/BglII fusion

w / 30 f m 

cgtctgcacagcgttcgcaaatgtgatttccaagcgc % ccaaagatccag. Dgaattc 

+ +- -+ + + 

gcagacgtgtcccaaccctttacactaaaggttcccgcgtttctagctc. . .CTTAAC 

ValCysThrGlyLeuAlaAsnVallleSerLysArgAlaLysIleCln 

FIG.13 
1381

C TCGGTGGC AGATCTGCG ATCGTCCATAACCGTTCAGATGTTG ATT GGAAC TGGGTGGGG 
+ + + + + + 

G AGGC ACCGTC TAG ACGC TAGC AGGT ATTGGC AAGTCTAC AACTAACCTTGACCC ACCCC 



TACACAGCTCCGAAG ACCCAGTGAACGTATACCTAAG ACACTTTGACACGGCCGG AACAC 

+ + + + + + 

ATCTGTCG AGGC TTCTGGCTCACTTGC ATATGG ATTCTGTGAAACTGTGCCGGCCTTGTG 



TGTAAGTCCCTTCGTATTTCTCCGCCTGTGTGGAGCTACC ATCC AATAACCCCC AGCTGA 
+ + + + + + 

ACATTCAGGGAAGC ATAAAG AGGCGGAC AC ACCTCGATGGTAGGTTATTGGGGGTCGACT 

AAAAGCTGATTGTCGATAGTTGTGATAGTTCCCACTTGTCCGTCCGC ATCGGC ATCCGCA 
♦ + + + + + 

TTTTCG AC TAACAGC TATC AAC ACTATCAAGGGTG AACAGGCAGGCGTAGCCGTAGGCGT 



GCTCGGG AT ACT TCCGACCT AGG ATTGG AT GCATGCGG A ACCGCACgAGGGCGGGGCGG A 

+ + + + H + 

CGAGCCCTATCAAGGCTGGATCCTAACCTACGTACGCCTTGGCGTGcTCCCGCCCCGCCT 



AATTGAC AC ACCACTCCTCTCC ACGCAgCCGTTCAAGAGGTACGCGTATAGAGCCGTATA 
+ + + + + + 

TTAACTGTGTGGTGAGGAGAGGTGCGTcGGCAAGTTCTCCATGCGC ATATCTCGGC AT AT 

1740

FIG. 15 sheet 1
1741

GAGCAGAGACGGAGCACTTTCTGGTACTGTCCGCACGGGATGTCCGC ACGG AGAGCCAC A 
+ + + + + + 

CTCGTCTCTGCCTCGTGAAAG ACCATG ACAGGCGTGCCCTACAGGCGTGCCTCTCGGTGT 



AACGACCGGGGCCCCGTACGTGCTCTCCTACCCCAGGATCGCATCCTCGCATAGCTGAAC 
• + _ + + « + + ♦ 

TTGCTCGCCCCGGGGC ATGC ACC AG AGG ATGGGGTCCTAGCGTAGG AGCGTATCGAC TTG 



ATCTATATAAAGACCCCCA AGGTTCTC AGTCTCACC AAC ATCATG AACC AACAATC AAC A 
+ + + + + + 

TAGATATATTTCTGGGCGTTCC AAGAGTC AGAGTGGTTGTAGTAGTTGGTTGTTAGTTGT 



SalI 68
/ — ^— v — — ' \ 

GGGTCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCCTTCGCCA 

-+ + + + + 

CCC AGCTGTAC ATGGCC AAGG AGCGGCAGTAGAGCCGG AAGG AGCGGTGACGG AAGCGGT 

1980 

MetTyrArgPheLeuAlaVallleSerAlaPheLeuAlaThr AlaPhe AlaLys 



EcoRI 30 
1981 ^JL, , L 

AGTCTAGAGG ATCCCCGGGCGAGCTCG AATTCCCCGGG ATCC AG 

*— **■ — — — — — — — — — — — — — — — — — a#a##ftSAK#a#A> « 

TCACATCTCCTAGGGGCCCGCTCG AGCTTAAGGGCCCCTAGGTC 

Ser ArgGlySerProGlyGluLeuGluPheProGly IieGln endoglucanase coding 



EcoRI 

CCCGGGCGAGCTCGAATTC 
GGGCCCGCTCGAGCTTAAC 

FIG. 15 sheet 2 
351

ACCTTCTCAGTCTCACCAACATCATCAACCAACAATCAACACCCTCCACTCTACACGATC 
TCCAAGACTC AGAGTGGTTCrACTAGTTCCTTGTTAGTTGTCCCAGCTCACATCTCCTAG 

22 EcoRl 
J 7, 

CCCGGCCGAGC'iCCAATTCCCCGGATCCG TCGACCTGC AGGGGGGGGGGGGGTCTTCTCC 
GGGC CCGCTCGAGCTTAAGGGGCCTAGGCAGCTGGACGTCCCCCCCCCCCCCAG AAGAGG 

76 

, / 

ACCGTCCTCTTGCAGAGCACACACAGAGCTGAAGACGATGGCG AAC AAACATCTGTCCCT 
+ + + + + + 

TGGCAGG AG AACGTC TCG TGTGTGTCTCG AC TTCTGCTACCGC TTGTTTGT AG AC AC GG A 

Me c Ala AsnLy s Hi sLeuSerLeu 



/ 6 _/ 78 

CTCGCTCTTCCTCGTCCTCCTTGCCCTGTCGGCATCTCTACCTICCGGCCAA 

H -i + + + — 592 

GAGCGACAAGGAGCAGCAGGAACCCCACAGCCGTAGAGATCGAAGGCCGGTT 592 
SerLeuPheLeuValLeuLeuGlyLeuSerAlaSerLeuAlaSerClyGln 



FIG.17 


